RESEARCH PROBLEMS RAISED IN RECENT ISSUES
OF EDUCATIONAL PERIODICALS 


LAURA ZIRBES 
The Lincoln School of Teachers College


There are numerous indices of the present trend in educational 
research. While no one index is sufficiently representative, a number 
of careful searches in well chosen but restricted areas should reveal 
not only the trend of research, but a series of unsolved problems. 
The solution of such problems is often a significant contribution to 
scientific progress and educational practice. They are most often 
formulated as a result of and in connection with studies published in 
periodicals, books and monographs, in the pursuit of research and 
inquiry in graduate schools and bureaus of research, in experimental 
schools and in oral utterances of educational leaders. They seldom 
stand out conspicuously, but are more often mentioned incidentally 
in connection with the presentation of a practical situation which calls 
for their solution. This article is one of a series which will, from time 
to time, seek to rescue such statements of problems from oblivion, in 
the hope that re-statement, frequency of occurrence and accessibility 
will combine to define the trend and, by speeding the attack, hasten 
their solution. 

The search in this instance was confined to the three most recent 
issues of the following periodicals, except as noted:¹


1. Educational Administration and Supervision
2. The Elementary School Journal
3. The Journal of Applied Psychology
4. The Journal of Educational Psychology
5. The Journal of Educational Research
6. The Psychological Bulletin
7. The School Review
8. School and Society (10 issues)
9. The Teachers College Record (1 issue)


¹ The search was confined to autumn issues of 1921 because several publications
have no summer numbers and the inclusion of materials appearing on widely
separated and non-contiguous dates, and consequently not within the limits of
single volumes, would be less significant for the purpose of defining a trend.
Several carefully limited searches will serve that purpose better than one more
inclusive investigation.


Periodicals will be referred to by the numbers assigned in this list 
and page numbers refer to the current volume (1921). These are 
given so that those who are in a position to conduct research may 
readily find the original statement of any problem in its setting. 
Differences in form of statement are due to an attempt to retain the 
original phrasing whenever possible. 


I. PROBLEMS REFERRING TO MENTAL TESTS


Studies of the constancy of the IQ. 6, p. 339, 341, 342; 4, p. 
314, 323. 

To what extent is mental age valuable for prognosis of pupil 
achievement in the lower grades? 4, p. 383. 

There is need for the validation of Stanford norms, the revision 
of tests and administrative procedure on the basis of a large number 
of unselected children, by a group of disinterested psychologists. 
4, p. 400. 

Why is there a progressive increase in overlapping evident in 
individual test (Binet) results and absent in the case of group tests? 
4, p. 405. 

Comparison, analysis and evaluation of group tests of intelligence. 
6, p. 342. 

Need for determining the reliability and validity of tests and 
methods used in the interpretation of results. 5, p. 136. 

Adequate determination of the total distribution of abilities in 
adolescence because school groups show effect of cumulative selection. 

A number of sets of norms obtained from representative groups of 
cases. 3, p. 70. 

Critical investigation of the concept “general intelligence” and
investigation regarding the specificity of prognosis problems. 3, p. 76. 

Investigation and qualitative examination of test responses in 
individual cases where results of retest show great inconstancy of IQ. 
3, p. 158. 

Investigation of the relation of mild social and physical maladjust- 
ment of superior individuals, to marked intellectual achievement. 
8, p. 425. 

Intensive scientific study of sixty or more carefully selected infants
under controlled conditions and through a number of years to determine
educability due to heredity; racial differences; health controls.
8, p. 312.


II. CURRICULUM STUDIES


Scientific determination of curricula in history and allied subjects. 
5, p. 294; 7, p. 617; 7, p. 573; 8, p. 386. 

Reconstruction of college curricula in terms of distribution of 
abilities and individual differences. 8, p. 389; 8, p. 437. 

Experimental study to show how the double demand for assured 
values of racial experience and the proper utilization of children’s 
purposes may best be met. 9, p. 287; 9, p. 289. 

Curriculum research in chemistry. 7, p. 646; 8, p. 220. 

Statement of objectives in reading and experimental determination 
of materials to accord with objectives. 7, p. 573. 

Mathematical curricula for high schools based on investigation of 
socially valuable relations between quantities, and experimental data 
concerning training in the ability to think in terms of quantitative 
data of proven social significance. 7, p. 646. 

Curriculum reconstruction in geography. 8, p. 437.

Vocabulary studies to ascertain the relative importance and signifi- 
cance of words of foreign derivation. 9, p. 368. 


How may health instruction be co-ordinated with other subjects
of the curriculum? 2, p. 41.


III. STUDIES PERTAINING TO ADMINISTRATION AND SUPERVISION


Construction of scientific instruments and methods that will aid in 
diagnosis of teaching and supervisory process and the selection and 
improvement of teachers. 5, p. 83; 3, p. 39. 

Experimental investigation of effect of textbooks on outcomes of 
instruction. 4, p. 342; 9, p. 358. 

Necessity for studying the effect of various marking systems and of 
selecting a system with reference to objectives and function in control- 
ling instruction. 7, p. 510. 

What is the maximum age and ability range of an effective group? 
4, p. 342. 


An application of statistical method in the analysis of factors of 
teaching success. 5, p. 89. 

4 The Journal of Educational Psychology 


Suggestions as to necessary refinement of survey methods and 
interpretations. 1, p. 433.

Experimental evaluation of classification schemes: horizontal and 
graded versus vertical and parallel. 2, p. 71. 

Experimental investigation of improvement in the quality of 
instruction due to controlled causes. 8, p. 469. 

Occupational descriptions of university positions as one possible 
means for the improvement of instruction in universities. 8, p. 293. 


IV. EDUCATIONAL TEST PROBLEMS


Reformulation and analysis of silent reading problems as next steps. 
4, p. 304; 4, p. 350; 6, p. 350; 8, p. 211. 

Does the function represented by the Thorndike-McCall Reading 
Test yield to practice or training or is it one which develops primarily 
as a result of inner growth? 4, p. 384. 

Research leading to the improvement of reading tests. 4, p. 464. 

Need for additional forms of Gray Tests. 4, p. 381. 

Necessity for determining reliability and validity of subject tests 
in order to interpret and use results wisely. 5, p. 136. 

Vocabulary checks on material in standard tests by use of Thorn- 
dike’s Wordbook. 9, p. 368. 


V. LEARNING STUDIES 


What are the methods employed by children in the gradual 
acquirement of the power of reading numerals? 4, p. 365. 

Review of scattered data and experimental and statistical data to 
reveal to what extent various types of specific training with words 
increase vocabulary. 4, p. 456. 

To what extent would an experimental evaluation of “piece-meal
learning” affect curriculum research? 4, p. 474.

Experimental determination of the educational significance of 
individual differences upon which differentiation in curricula can be 
based. 5, p. 151. 

Need for development of tests of pupils’ use of economical and 
desirable methods of study, not to show results of study but habits of 
study. 7, p. 706. 

Case studies to determine whether inefficient work is due to specific 
present disability or intelligence limitations. 5, p. 292. 

Experimental study of how children learn to pronounce. 2, p. 182. 








Analysis of silent reading progress with studies of the comparative 
difficulty of materials in different successive grades. 2, p. 146. 


VI. RATING-SCALE PROBLEMS


Methods of discovery and encouragement of college students of
superior ability and promise of achievement in research. 8, p. 439;
8, p. 239; 8, p. 254.


VII. NEED FOR STATISTICAL DEVICES AND METHODS


A method which permits study of the relationship of mental test
scores and achievement in a simple, direct and specific way. 3, p. 77.

A valid method for computing promotion rates. 5, p. 309.


In addition to the problems succinctly stated there are numerous 
evidences of the need of more adequate data to support conclusions and 
of a more careful definition of terms and usages. 

It is often difficult to ascertain whether “diagnosis” refers to
individual cases, to general class conditions or to larger units of
organization.

“Analysis” is sometimes contemplated with reference to function
and at other times to phases of end results, test scores, or pupil traits.
The fact that the meaning attaching to these two terms is so often
inadequately stated is, no doubt, one indication that new meanings
are beginning to attach to their use.

Summary.—The three specific problems most frequently mentioned 
in the area to which this search was limited are: 

Studies of the constancy of the IQ (5).¹

Reformulation and analysis of silent reading (4). 

Scientific determination of curriculum in social studies (4). 

The fact that all of these problems are well represented in the 
titles of present contributions is an indication that they have already 
been attacked and are urgent because the need for further study is so 
frequently mentioned. No doubt the specific problems are more clearly
defined as one result of pioneer studies in any field. 

Frequency is not taken as a valid measure of significance in this
connection. Pioneer studies in any field are the precursors of future
trends. If statements of unsolved problems raised in the course of
investigation were available in connection with the listed conclusions
now so frequently found in reports, this exploratory function of
pioneer studies would be more adequately served.


¹ The total frequency of mention in the thirty-two magazines.

The classification of problems was not seriously hampered by over- 
lapping and it is only necessary to count the references under each 
heading to get an idea of the relative frequencies of the problems thus 
grouped: Mental Tests 16; Curriculum Studies 15; Administration and 
Supervision 11; Educational Tests 9; Learning Studies 8; Rating Scales 
3; Statistical Devices 2. We are led to wonder whether the paucity of 
studies in new statistical methods and devices is due to too great a 
reliance on some of those now so uncritically accepted, or whether 
some of our creative thinkers along this line are so deeply involved in 
carrying forward some specific line of research, that there is insufficient 
opportunity for the mental manipulation of relations which often 
leads to contributions of high order. 

The leadership of educational writers, whose recent contributions 
were the material read in this search, gives added significance to the 
list of problems thus assembled.

A tabulation of the ‘‘Contents”’ of the same periodicals would give 
an interesting exhibit of the recent results of research, but that is not 
within the scope of this article. 

In a recent article in the Journal of Applied Psychology, Terman
gives quantitative data on the shift in the trend of psychological
research, which has become increasingly noticeable since 1900. Of the
researches in which the 306 members of the American Psychological
Association are at present engaged, only 48.5 per cent are in pure
psychology while 51.5 per cent are in applied fields. This development
is in agreement with that in other sciences, and is an indication
of the increasing “value” placed on psychology as a factor in the
solution of human problems.


GROUP WILL-TEMPERAMENT TESTS 
M. J. REAM 


Bureau of Personnel Research, Carnegie Institute of Technology 


In predicting success in school subjects, or success in specific voca- 
tions, the limitations of intelligence tests are recognized by their most 
enthusiastic advocates. The low correlations of test scores with 
success prove that there are other significant traits. 

The Downey scale¹ of individual will-temperament tests is an
attempt to bring to light some of these other factors. While individual 
testing presents many subtle observations not possible with group 
tests, yet a group method of giving the test was necessary if the 
Downey scale were to be included in a comprehensive study of success- 
ful and unsuccessful salesmen conducted at the Carnegie Institute of 
Technology. Accordingly, the Downey scale was modified at the 
Carnegie Bureau of Personnel Research so that groups of subjects 
might be tested at one time. 

This series of group tests has been given during the past two years 
to 500 insurance salesmen, 600 Freshmen at the Carnegie Institute of 
Technology, and 150 stenographers, typists, and comptometer opera- 
tors at a technical night school. The writer has carried on the evalua- 
tion of the test for the insurance salesmen only. Production records 
were available for about 125 of these salesmen. Some parts of the 
test have proved to be of positive value and are now included in the 
selection program for insurance salesmen prepared at the Carnegie 
Bureau of Personnel Research. 

This group test, as here presented, follows the Downey scale rather 
closely in the test situations presented. Handwriting is used in 
eight of the eleven parts of the test, but none of the usual assumptions 
of graphology are made; only the changes in handwriting under the 
controlled experimentation of this test are considered in the results. 

The group test differs from the Downey test in two or three important
respects. First, the work-limit is changed to a time-limit basis,
the necessity for which is obvious in group testing. Second, the
giving of the test is less subjective and less dependent on the examiner’s
personality and technique. The scoring of the test is objective and
quantitative, which is essential if the test is to be used in the
commercial field and handled by psychologically untrained examiners.
Third, the group test is much shorter. It can be given in thirty
minutes.


¹ Downey, J. E.: The will-profile. A tentative scale for measurement of
the volitional pattern. Department of Psychology, Bull. No. 3, University of
Wyoming, 1919.


THE PARTS OF THE TEST


In describing the parts of the test, reference is made in each case to 
the name of the Downey individual test from which each part has been 
adapted. The name and the Downey definition appear in parentheses. 
Reference to these already familiar names will aid in identification. 
But whether these tests actually measure the traits indicated is not 
the concern of this article. Sufficient for the present purpose is the 
fact that scores in the tests show a relationship to successful sales work. 

Parts 1 and 10. (Speed of movement. This test is intended to
measure “normal speed of movement relative to size of person and
age.”)

The directions for Parts 1 and 10 of the group test are as follows:

In the space below, copy the words “United States” as you usually
write them, in your usual style and at your usual speed. Copy the
words repeatedly until you are told to stop. You do not need to
hurry. Wait for the signal before beginning.

The time allowed is thirty seconds. One might expect considerable 
variation in speed of writing under different physical conditions. To 
counteract in a measure such variation, Part 1 is repeated as Part 10 
near the end of the test. The score is the average number of letters 
written in the two parts. 

Part 2. (Motor inhibition. The ability to retard writing is
measured, which is related to “capacity to keep in mind a set purpose
and achieve it slowly.”)

The directions for Part 2 are:

Write each of the words as slowly as possible on the line after
each word. Write the words AS SLOWLY AS YOU POSSIBLY CAN
and still keep the pencil moving. Do not enlarge your writing. Wait
for the signal before beginning.

This test is given three times. Two repetitions are necessary to 
make some subjects comprehend that writing just as slowly as possible 
is really wanted. Only the last repetition, for which sixty seconds are 
allowed, is scored. The score is the number of letters written. 

Part 3. Speed of decision in choosing better traits. This test consists
of an elaboration of the Downey list of opposite traits. Samples
are “careful-careless,” “slow-quick,” “gloomy-cheerful.”

The directions are:

Check the ONE trait in each pair which is the BETTER in most 
circumstances. For example, it is better to be careful than careless in 
most circumstances. So, careful should be checked in the first pair. 
Do not skip any pairs. 

Forty-five seconds are allowed and the score is the total number of 
pairs of traits checked. The purpose of this part is merely to provide 
a basis for comparison with speed of decision in traits which apply to 
the subject personally. This second test of speed of decision appears 
later in the series. 

Part 4. (Freedom from inertia. The ratio of the rate of normal
writing to the rate of speeded writing is determined. This is intended to
measure “quickness in warming up and tendency to work at one’s
highest speed without external pressure.”)

The directions for Part 4 are: In the spaces below write the words
“United States” as quickly as you possibly can and still have the writing
legible. Continue until you are told to stop. Wait for the signal
before beginning.

Sixty seconds are allowed. In scoring, the average number of
letters written in Parts 1 and 10 above is divided by the number of
letters written in Part 4. The resulting decimal is the score for Part 4.
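
The scoring rule for Parts 1, 10 and 4 can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above and is not part of the original test materials; the function names and the letter counts are hypothetical. Raw letter counts are used, as the text states, although the time limits differ (thirty seconds for Parts 1 and 10, sixty for Part 4).

```python
def speed_of_movement(letters_part1: int, letters_part10: int) -> float:
    """Score for Parts 1 and 10: average number of letters written."""
    return (letters_part1 + letters_part10) / 2


def freedom_from_inertia(letters_part1: int, letters_part10: int,
                         letters_part4: int) -> float:
    """Score for Part 4: normal-writing average divided by the speeded count."""
    if letters_part4 <= 0:
        raise ValueError("Part 4 must contain at least one letter")
    return speed_of_movement(letters_part1, letters_part10) / letters_part4


# Hypothetical subject: 60 and 64 letters at normal speed, 155 letters when speeded.
normal = speed_of_movement(60, 64)
ratio = freedom_from_inertia(60, 64, 155)
print(normal, round(ratio, 3))
```

A subject whose normal writing is already close to his speeded writing obtains a ratio near 1; a lower decimal indicates a larger gap between the two.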

Part 5. (Motor impulsion. Motor impulsion or the “tendency to
impetuosity and energy of reaction is measured by the magnification
and increased speed of writing under distraction.”) In Part 5,
visual control over writing is first eliminated by having subjects write
the phrase “United States” without looking at the paper. This is
repeated once. Next, the subjects must write the phrase repeatedly
while counting the number of taps which the examiner makes on the
table. The examiner diverts the attention of the subjects from the
writing by tapping loudly and irregularly. The subjects must watch
the examiner, not their papers. Two series of such writing under
distraction are given, each lasting twenty-five seconds. The size of
this writing and the speed, measured by the number of letters written,
are compared with normal size and speed, measured in Parts 1 and 10.
The resulting ratios added together give the score for this part.
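
A rough sketch of the Part 5 score follows. The text says only that size and speed under distraction are "compared with" the normal values and that the two ratios are added. The direction of each ratio (distracted over normal), the unit in which size is measured, and the absence of any adjustment for the different time limits are all assumptions made here for illustration. The function name and the sample figures are hypothetical.

```python
def motor_impulsion(size_distracted: float, size_normal: float,
                    letters_distracted: float, letters_normal: float) -> float:
    """Sum of a magnification ratio and a speed ratio (assumed: distracted / normal)."""
    if size_normal <= 0 or letters_normal <= 0:
        raise ValueError("normal size and normal speed must be positive")
    magnification_ratio = size_distracted / size_normal
    speed_ratio = letters_distracted / letters_normal
    return magnification_ratio + speed_ratio


# Hypothetical subject: writing grows from 36 to 45 units in size,
# and letter count falls from 64 to 48 under distraction.
score = motor_impulsion(45, 36, 48, 64)
print(score)
```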

Part 6a. Success in disguise. (Flexibility. Versatility in disguising
one’s handwriting is suggested as “characteristic of the histrionic
or fluidic temperament.”) An unpublished study of the
students of dramatic art at the Carnegie Institute of Technology
gives some evidence in support of this suggestion.

The directions for Part 6 are: Write the words “United States” in
the space below trying to disguise your handwriting in as many ways
as possible and as much as you can. Try out any disguise you can
think of but do not print. Take as much time as you need and copy
the words as many times as necessary. Keep trying until you feel
that you have made a copy that even a handwriting expert could not
identify as yours.

Three minutes are allowed for Part 6. The score is the number of
different disguises accomplished. A sample scoring scale has been
provided to facilitate the scoring of this test.

Part 6b. Attempts at disguise. (Volitional perseveration. This is
defined as “persistence in attaining an indefinitely defined end.”)
Many subjects do not continue to work the entire time allowed for
Part 6. Since the number of attempts at disguise varies, Part 6 is also
used as the group modification of the individual test of perseveration,
in which the measure is the length of time taken for one disguise.
Part 6 in this case is scored according to the number of attempts to
disguise the writing of the phrase.

Part 7. (Care for detail. Accuracy in copying samples of handwriting
is intended to measure “attention to details.”) The directions
for Part 7 are: Imitate the model sentences AS EXACTLY AS YOU
CAN. Take as much time as you need. Wait for the signal before
beginning.

Ninety seconds are allowed for this part. It is scored by a stencil 
in which inaccuracies are specifically indicated. Only two sentences 
are scored. The maximum number of errors is thirty. 

Part 8. (Coordination of impulses. This part is a group form of
the Downey test of writing in restricted space. It is intended to
measure “capacity to handle a complex situation without forgetting
either factor involved.”)

The directions for Part 8 are: Copy each of the sentences below 
as rapidly as possible on the line after each sentence. Be careful not 
to let the writing extend beyond the end of the line. Remember, 
you are to write the sentences as quickly as you possibly can. Wait for 
the signal before beginning. 

The sentences become increasingly difficult to compress in the 
limited spaces provided. Forty-five seconds are allowed. The score 
is the total number of letters written on the short lines. Letters 
which extend beyond the lines are not counted. 

Part 9. (Speed of decision. The individual test is intended to
measure “quickness in reaching a decision or conclusion.”) Part 9
presents the same list of paired traits as Part 3 but in this case there is
a personal reference. The directions for Part 9 are: Check the ONE
trait in each pair which describes YOU better. You may be in doubt
in some cases; for example, you may be careful in some things and
careless in others, but as a general rule you are more often one than
the other. Check that trait. Do not skip any pairs. It will make
no difference if you do not finish the whole list; speed does not count.

Sixty seconds are allowed and the score is the total number of
pairs of traits checked. In this part, the decision called for is
“subjective,” involving self-judgments. A number of persons otherwise
quite rapid in their reactions show considerable blocking when checking
these personal traits. The test is intended to measure ease and
rapidity of decision in subjective items, and the tendency not to be
critical. To throw this tendency into greater relief the score on this
part is compared with the score made on Part 3 and a ratio computed.
The score thus determined is treated as a measure of self-consciousness,
on the thesis that the highly self-conscious individual will be
proportionately slower in making subjective, personal judgments than in
making non-personal decisions.
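
The comparison of Part 9 with Part 3 can be illustrated as below. The text does not say whether the ratio is taken on the raw scores or on rates adjusted for the different time limits (forty-five seconds for Part 3, sixty for Part 9), so this sketch offers both. The direction of the ratio (Part 9 over Part 3), the `per_second` option, and the sample scores are assumptions introduced here.

```python
PART3_SECONDS = 45  # time limit stated for Part 3
PART9_SECONDS = 60  # time limit stated for Part 9


def self_consciousness_ratio(pairs_part9: int, pairs_part3: int,
                             per_second: bool = False) -> float:
    """Ratio of personal (Part 9) to impersonal (Part 3) trait checking."""
    if pairs_part3 <= 0:
        raise ValueError("Part 3 score must be positive")
    if per_second:
        # Assumed variant: compare checking rates rather than raw counts.
        return (pairs_part9 / PART9_SECONDS) / (pairs_part3 / PART3_SECONDS)
    return pairs_part9 / pairs_part3


# Hypothetical subject: 40 pairs checked in Part 3, 30 pairs in Part 9.
raw_ratio = self_consciousness_ratio(30, 40)
rate_ratio = self_consciousness_ratio(30, 40, per_second=True)
print(raw_ratio, round(rate_ratio, 4))
```

On either variant, a lower ratio means the subject slowed down more when the judgments became personal.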

Part 11. (Assurance. This test is intended to measure the “degree
of confidence with which one maintains his opinions against
contradiction.”) In this part a chart which contains Arabic and Roman
numerals and small and capital letters, nine characters in all, is
exposed to view for ninety seconds, after which the chart is withdrawn.
The test blank contains fifteen statements about the chart, which
are to be marked TRUE or FALSE, and doubly underlined if the
subject feels especially sure of his answer. A final paragraph
contradicts the actual conditions of the chart as follows: If you finish before
time is called, you may check your accuracy in the last three
statements. The word FALSE should be underlined after statements 13,
14, and 15. If you have not done this, you are at liberty to change
your answers.

This suggestion is false as regards statements 14 and 15. The score
on this part is based on the subject’s resistance to suggestion and the
number of answers which he doubly underlines. Part 11 has no time
limit. The test papers are collected as soon as the subjects finish
this part. It will be noticed from this description of the parts of the
group test that all the items of the Downey individual test are retained
in modified form except Resistance to Opposition, and Revision. An
added item is the ratio between the two checkings of traits, which is
intended to show the effect of self-consciousness.


RELATIONSHIP BETWEEN GROUP TEST AND DOWNEY INDIVIDUAL
TEST


In order to determine the correspondence between the scores and 
volitional patterns obtained from the group test and those obtained 
from the Downey individual test, a group of persons was tested, first 
with the group test and later tested individually with the Downey test. 
The subjects were twenty-one men (Juniors in the School of Industries 
at Carnegie) and two young women, making twenty-three subjects in 
all. The testing was done by four assistants who were trained in the 
method of giving the Downey individual test. 

The scores on the group and individual parts of the test were
correlated with the following results: 


TABLE I

Parts 1 and 10, group test with Speed of movement, individual test ........... 0.72
Part 2, group test with Motor inhibition, individual test .................... 0.55**
Part 3, group test with Speed of decision, individual test ................... 0.42
Part 4, ratio, group test with Freedom from inertia, individual test ......... 0.05
Part 5:
  Magnification ratio, group test with Motor impulsion, individual test ...... 0.54*
  Speed ratio, group test with Motor impulsion, individual test .............. 0.42*
  Summed ratios, group test with Motor impulsion, individual test ............ 0.50*
Part 6a, Success in disguise, group test with Flexibility, individual test ... 0.12*
Part 6b, Attempts, group test with Perseveration, individual test ............ 0.90**
Part 7, group test with Care for detail, individual test ..................... [illegible]
Part 8, group test with Coordination of impulses, individual test ............ 0.16*
Part 9, group test with Speed of decision, individual test ................... 0.46
Part 11, group test with Assurance, individual test .......................... 0.42*


* Rank correlation formula used.
** Correlation ratio used, formula for non-linear distribution.
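
For readers unfamiliar with rank correlation, the following sketch shows one common form, the Spearman rank-difference coefficient. The article does not state which rank formula was used, so the choice of this one is an assumption. The scores below are invented for illustration and are not the study's data. The sketch does not handle tied scores.

```python
def ranks(values):
    """Return 1-based ranks (1 = lowest); raises if any values are tied."""
    if len(set(values)) != len(values):
        raise ValueError("this sketch does not handle tied scores")
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def rank_correlation(x, y):
    """Spearman rank-difference coefficient: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists of at least two scores")
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Hypothetical scores for five subjects on a group part and its individual counterpart.
group_scores = [12, 18, 9, 25, 15]
individual_scores = [30, 41, 35, 52, 38]
rho = rank_correlation(group_scores, individual_scores)
print(round(rho, 2))
```

Because only the order of the scores enters the formula, the coefficient is unaffected by the differing units of the group and individual tests.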


The results show relatively high correlations with those correspond- 
ing individual tests which are scored entirely objectively and quanti- 
tatively. These individual tests are: Speed of movement, motor 
inhibition, speed of decision, freedom from inertia, perseveration, and 
care for detail. The only low correlation in this group is that between 
Part 4—ratio and freedom from inertia. Scores in both of these tests 
are ratios derived from raw scores. Since ratios do not show a bell- 
shaped distribution, a high correlation is hardly to be expected. 

Low positive correlations are found with those parts which are
qualitatively scored or subjectively combined in the individual test— 
coordination of impulses, motor impulsion, assurance and flexibility. 

The group and individual tests revealed volitional patterns of the 
same general type for each subject. There were occasional breaks 
in the correspondence but these may have been partly the result of 
repetition of certain parts in the second testing. This was particularly 
marked in the checking of traits. Two-thirds of the subjects raised 
their decile standings on this part in the later test, while only one 
subject lowered his decile standing. In view of all the results of this 
experiment it is evident that the group test is a fairly satisfactory 
approximation of the Downey individual test. 

An incidental study was made of the test names and definitions
published by Professor Downey. Scores and standings in the group
test were reported to one group of thirty-five salesmen, together with
standings in other tests: intelligence, meeting objections and business
information. Each person in the group was given a chart containing
the Downey test names with his standings in the corresponding parts
of the group test graphically presented. The Downey descriptive
account of each test was read, after which each subject was asked to
mark whether his standing indicated in the graph was correct, too
high, or too low, according to his own judgment of himself concerning
the various traits.

The results of these self ratings are shown in Table II.


TABLE II. REACTIONS OF THIRTY-FIVE SUBJECTS TO STANDINGS IN TRAITS
REPORTED FROM RESULTS OF GROUP WILL-TEMPERAMENT TESTS
AND OTHER TESTS

                                                Per cent  Per cent  Per cent  Per cent
Test                                            right     wrong     too high  too low

Parts 1 & 10 (Speed of movement)                69.0      31.0      14.0      17.0
Part 4 ratio (Freedom from inertia)             93.0       7.0       7.0       0.0
Part 9 (Speed of decision)                      93.0       7.0       0.0       7.0
Part 6, Disguises (Flexibility)                 90.0      10.0       3.0       7.0
Part 5 (Motor impulsion)                        74.0      25.0       3.0      22.0
[label illegible]                               80.0      20.0      13.0       7.0
Part 9 ratio (Freedom from self-consciousness)  70.0      30.0      12.0      18.0
Part 2 (Motor inhibition)                       88.0      12.0       9.0       3.0
Part 7 (Care for detail)                        73.0      27.0      15.0      12.0
Average                                         81.1      18.8       8.4      10.3

Other tests:
Intelligence                                    81.0      19.0       0.0      19.0
Meeting objections                              84.0      16.0       3.0      13.0
Business information                            73.0      27.0       9.0      18.0
Average                                         79.3      20.6       4.0      16.6
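
The averages reported for the nine will-temperament rows of Table II can be checked with a short calculation. The sketch below assumes the Part 7 entry reads 73.0 per cent right and takes the remaining figures as printed; the variable and function names are illustrative only.

```python
# Columns: per cent right, wrong, too high, too low (nine will-temperament rows).
rows = [
    (69.0, 31.0, 14.0, 17.0),  # Parts 1 & 10
    (93.0, 7.0, 7.0, 0.0),     # Part 4 ratio
    (93.0, 7.0, 0.0, 7.0),     # Part 9
    (90.0, 10.0, 3.0, 7.0),    # Part 6, disguises
    (74.0, 25.0, 3.0, 22.0),   # Part 5
    (80.0, 20.0, 13.0, 7.0),   # row whose label is not legible
    (70.0, 30.0, 12.0, 18.0),  # Part 9 ratio
    (88.0, 12.0, 9.0, 3.0),    # Part 2
    (73.0, 27.0, 15.0, 12.0),  # Part 7 (assumed reading of the garbled entry)
]


def column_means(table):
    """Arithmetic mean of each column of a list of equal-length rows."""
    n = len(table)
    return [sum(col) / n for col in zip(*table)]


means = column_means(rows)
print([round(m, 1) for m in means])
```

Rounded to one decimal, the column means agree with the tabled averages of 81.1, 18.8, 8.4 and 10.3, which supports the reading of the Part 7 entry as 73.0.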


Fig. 1. Predictive value of group will-temperament tests with forty-seven
salesmen in first insurance school. X: Composite score in tests. Y: Success in
selling insurance.



On the surface the results appear rather striking, but sources of
error immediately suggest themselves. The high percentage marked
“right” is without doubt partly the result of suggestion since many
people have never attempted to analyze their own traits, much less to 
rate them. A self rating made before the test results are presented 
would be a better check. Nevertheless, it is worthy of note that the 
will-temperament tests received just as high a percentage of correct 
ratings as the non-volitional tests—intelligence, meeting objections, 
and business information. 


RESULTS


The selling records of the salesmen were used as a criterion to
determine the value of this series of group tests. During an eleven
weeks’ course of instruction each salesman was required to spend each
afternoon in actual field selling. The territorial as well as the time
conditions were uniform. At the end of this period, standings were
assigned on the basis of number of cases and amounts sold. The
amount of insurance sold ranged from none at all to $140,000.00.
The men were placed in five groups, from the entirely unsuccessful
group to the most successful group. The tests were evaluated with
this criterion.


Fig. 2. Predictive value of group will-temperament tests with seventy-five
salesmen in second insurance school. X: Composite score in tests. Y: Success
in selling insurance.

The accompanying charts show the discriminating value of this 
series of tests with two separate groups of salesmen. Scores in the 
tests were statistically weighted and combined into a single composite 
score for each salesman. Median composite scores for each group are 
shown in the following table: 
TABLE III 

                                       Median composite score 
                                       Group A | Group B 
Successful salesmen (three groups)     [values illegible in source] 
Doubtful salesmen                      [values illegible in source] 
Unsuccessful salesmen                  [values illegible in source] 

Some tests showed upper or lower critical scores. Other tests 
showed discriminative value only when combined with another test 
in a three-variable scatter diagram. The most useful tests for selecting 
salesmen are: Part 6 (success in disguise), Part 9 (speed in checking 
personal traits), Part 5 (magnification and speed of writing under 
distraction), and Part 9 ratio (speed of checking traits in Part 9 
relative to speed in Part 3). The tests of least value are: Part 11 
(resistance to contradiction), Parts 1 and 10 (normal speed of writing) 
and Part 4 (ratio of normal to speeded writing). 
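
The composite scoring described above (part scores weighted, combined into one score per salesman, and summarized by a median for each success group) can be sketched as follows. This is only an illustration: the weights, part names and scores are hypothetical, since the article does not report the weights actually used.

```python
from statistics import median

# Hypothetical weights for three test parts (not taken from the article).
WEIGHTS = {"part_5": 1.0, "part_6": 2.0, "part_9": 1.5}


def composite_score(scores, weights=WEIGHTS):
    """Weighted sum of one salesman's part scores."""
    return sum(weights[part] * scores[part] for part in weights)


def median_by_group(records):
    """Median composite score for each success group."""
    grouped = {}
    for group, scores in records:
        grouped.setdefault(group, []).append(composite_score(scores))
    return {group: median(values) for group, values in grouped.items()}


# Made-up records: (success group, raw part scores).
records = [
    ("successful", {"part_5": 6, "part_6": 8, "part_9": 4}),
    ("successful", {"part_5": 4, "part_6": 6, "part_9": 6}),
    ("unsuccessful", {"part_5": 2, "part_6": 3, "part_9": 2}),
    ("unsuccessful", {"part_5": 4, "part_6": 1, "part_9": 2}),
]
medians = median_by_group(records)
```

A test series discriminates, in the sense used here, when the group medians fall in the same order as the success groups.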

This series of group will-temperament tests proved itself of sufficient 
value to be included in the Bureau of Personnel Research selection 
program for salesmen. Results from these tests are much more significant 
for sales work than intelligence test results. The group tests are 
used in connection with an evaluation of previous training, experience, 
and a study of interests pertinent to sales work. A valuable instrument 
for use in vocational selection has resulted. 


CONCLUSIONS 


1. The series of group tests approximate fairly closely the results 
obtained from the Downey individual will-temperament test. 


2. The tests are of positive value in predicting success in selling 
insurance. 











COMPARATIVE VARIABILITY AT DIFFERENT AGES 
V. A. C. HENMON 


University of Wisconsin 
AND 
W. F. LIVINGSTON 


Stoughton, Wisconsin 


There is a widespread belief, which frequently finds expression 
in the literature of education, that individual differences are greater 
during adolescence than at any other time in life and that the development 
from childhood to adolescence is not gradual but saltatory. 
G. Stanley Hall is the chief proponent of this doctrine. Concerning 
the adolescent period he says: “The human plant circumnutates 
in a wider and wider circle, and the endeavor should be to prevent it 
from prematurely finding a support, to prolong the period of variation 
to which this stage of life is sacred. . . .”1 “The possibility of 
variation in the soul is now at its height.”2 “The forces of growth now 
strain to the uttermost against old restrictions. It is the age of bathmism, 
or more rapid variation, which is sometimes almost saltatory.”3 
“Individual differences of all kinds are now suddenly augmented.”4 
“The range of individual differences and average errors in all physical 
measurements and all psychic tests increases.”5 
This theory has important practical applications and its influence 
is plainly visible in our present systems of school organization. The 
contention that youth is the period of great fluctuation and that 
therefore throughout the high school age there is a decided increase 
in variability in all mental functions implies that the secondary school 
should provide a wider range of elections in the curriculum, smaller 
classes and more individualization of instruction, and greater versatility 
in methods of presentation of subject matter in order to appeal 
to the widely varying characteristics of a high school class. On the 
other hand (and this is the more serious consideration), the implied 
greater similarity between children in the grades offers an excuse for 
larger classes, for poorer teachers, and for forcing all pupils through 
the same process and by the same methods until the approach of 
adolescence. 


1 Hall, G. Stanley: “Adolescence,” Vol. II, p. 88. 
2 ibid.: p. 89. 
3 ibid.: p. 90. 
4 ibid.: p. 363. 
5 Hall, G. Stanley: “Youth,” p. 6. 

While thousands of children have been tested in a great variety of 
mental and physical traits, the writers have been unable to locate any 
systematic review of the available evidence on this supposedly greater 
variability during adolescence. It is a curious fact that in spite of the 
importance of the theory, no one appears to have taken the trouble 
to present the evidence for it, much less question its correctness. 
Common observation, confirmed by more exact study, shows the wider 
variability in height and weight at adolescence and, by analogy, the law 
appears to have been extended to mental traits without adequate 
investigation. 

This study represents an examination of the comparative variabili- 
ties as revealed in some of the most representative studies of mental and 
physical development. Those investigations were used in which the 
number of cases was large, in which the variabilities had been deter- 
mined, and in which norms for a wide range of ages were available. 

The variabilities for different ages were rendered comparable by 
determining the coefficient of variation, obtained by dividing the 
measure of variability (average deviation, standard deviation or 
probable error) by the measure of central tendency (average or median). 
This is merely finding the per cent that the variability is of the central 
tendency from which the deviations were obtained. Whatever 
measures of central tendency or variability were used by the investigator 
were used for our calculations. While the essential problem 
was to compare variabilities at different ages, the variability in 
different grades is practically as important and this was determined 
in several sample cases. Incidentally, also, the ratios of female to 
male variability were calculated for the light they might throw on the 
mooted question of the variability of the sexes. 
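
A minimal sketch of the computation just described, assuming the standard deviation and the arithmetic mean as the two measures (the authors used whichever measures each investigator had reported). The sample heights and the function names are illustrative and are not data from the studies reviewed.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Variability as a fraction of the central tendency (SD / mean)."""
    return pstdev(values) / mean(values)


def female_to_male_ratio(cv_girls, cv_boys):
    """Ratio of girls' to boys' variability; below 1 means boys vary more."""
    return cv_girls / cv_boys


# Illustrative heights in centimetres (made up for the example).
heights_cm = [120.0, 125.0, 130.0, 135.0, 140.0]
cv = coefficient_of_variation(heights_cm)

# The coefficient carries no unit, so converting to inches leaves it unchanged.
cv_inches = coefficient_of_variation([h / 2.54 for h in heights_cm])
```

Because the coefficient is a pure ratio, it allows ages with very different average scores to be compared, which is the use made of it in the tables that follow.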


VARIABILITY IN PHYSICAL TRAITS 


The comparative variabilities in height, weight and lung capacity 
were computed from the data of Burk,1 Baldwin,2 and Gilbert.3 
Various other physical characteristics were studied also but are not 
reported here. In the interest of economizing space and to prevent 
confusion in examining the tables, only the total number of cases is 
given and not the number of cases at each age. The number of cases 
is in every instance sufficiently large to make the coefficients reliable. 
Table I gives the results for physical traits. 


1 Burk, F.: Growth of Children in Height and Weight, American Journal of 
Psychology, Vol. IX, 1898, pp. 253-326. 

2 Baldwin, Bird T.: Physical Growth and School Progress. Bulletin No. 10, 
U. S. Bureau of Education, 1914, p. 212. 

3 Gilbert, J. A.: Researches on the Physical and Mental Development of 
School Children. Studies from the Yale Psychological Laboratory, Vol. II, 
pp. 40-100, 1894. 





TABLE I. COEFFICIENTS OF VARIABILITY AT DIFFERENT AGES IN PHYSICAL TRAITS 

Columns (1) to (3) give height, (4) and (5) weight, (6) lung capacity. 
"..." marks no entry; [?] marks a value illegible in the source. 

                        Boys                                       Girls 
Age     (1)    (2)    (3)    (4)    (5)    (6)      (1)    (2)    (3)    (4)    (5)    (6) 
 6.0    ...   0.032  0.034  0.093   [?]   0.169     ...    [?]    [?]   0.086  0.097  0.190 
 6.5    ...    [?]    ...   0.098   ...    ...      ...    [?]    ...   0.087   ...    ... 
 7.0    ...   0.042  0.033  0.108  0.091  0.169     ...   0.034  0.029  0.104  0.087  0.206 
 7.5    ...   0.038   ...   0.123   ...    ...      ...   0.037   ...   0.086   ...    ... 
 8.0    ...   0.040  0.035  0.122  0.112  0.155     ...   0.033  0.035  0.098  0.096  0.159 
 8.5   0.043  0.034   ...   0.118   ...    ...     0.046  0.031   ...   0.098   ...    ... 
 9.0    ...   0.035  0.039  0.096  0.145  0.196     ...   0.037  0.035  0.108  0.116  0.135 
 9.5   0.044  0.039   ...    [?]    ...    ...     0.044  0.032   ...   0.113   ...    ... 
10.0    ...   0.035  0.032  0.122  0.100  0.168     ...   0.028  0.038  0.126  0.118  0.174 
10.5   0.044  0.036   ...   0.107   ...    ...     0.046  0.032   ...   0.145   ...    ... 
11.0    ...   0.041  0.025  0.114  0.094  0.145     ...   0.032  0.040  0.101  0.085  0.125 
11.5   0.044  0.034   ...   0.115   ...    ...     0.048  0.038   ...   0.135   ...    ... 
12.0    ...   0.040  0.038  0.106  0.091  0.124     ...   0.039  0.042  0.160  0.134  0.142 
12.5   0.046  0.040   ...   0.112   ...    ...     0.051  0.038   ...   0.145   ...    ... 
13.0    ...   0.039  0.038  0.142  0.106  0.146     ...   0.037  0.037  0.145  0.115  0.179 
13.5   0.052  0.040   ...   0.142   ...    ...     0.048  0.040   ...   0.121   ...    ... 
14.0    ...   0.037  0.057  0.124  0.171  0.191     ...   0.035  0.045  0.123  0.135  0.165 
14.5   0.055  0.044   ...    [?]    ...    ...     0.043  0.031   ...   0.136   ...    ... 
15.0    ...   0.046  0.051  0.119  0.140  0.185     ...   0.032  0.034  0.100   [?]   0.133 
15.5   0.054  0.047   ...   0.109   ...    ...     0.037  0.034   ...   0.103   ...    ... 
16.0    ...   0.039  0.032  0.117  0.093  0.161     ...   0.026  0.032  0.123  0.103  0.140 
16.5   0.047  0.033   ...   0.108   ...    ...     0.035  0.028   ...   0.091   ...    ... 
17.0    ...   0.031  0.018  0.077  0.087  0.163     ...   0.033  0.029  0.092  0.132  0.156 
17.5    ...   0.024   ...   0.068   ...    ...      ...   0.031   ...   0.083   ...    ... 
18.0    ...   0.032   ...    [?]    ...    ...      ...    [?]    ...   0.079   ...    ... 

(1) Burk’s data, 88,449 cases.           (4) Baldwin’s data. 
(2) Baldwin’s data, 1924 cases.          (5) Gilbert’s data. 
(3) Gilbert’s data, about 1200 cases.    (6) Gilbert’s data. 

An examination of the table shows for height and weight approximate 
constancy in the coefficients for boys up to the age of 12½ years, 
with a sudden rise to the high point at 14 or 14½ years and a decrease 
thereafter. The highest points for the three sets of measurements for 
height are at 14½, 15½, and 14 years, respectively. For weight the 
high points are at 13½ and 14 years. The same general tendency holds 
for girls except that with them the highest point is reached roughly 
two years earlier. In height the highest points are at 12½, 13½ and 
12 years, respectively. In weight the highest point is at 12 years in 
Baldwin’s results and in Gilbert’s data at 12 and 14 years. Both in 
height and weight, then, the theory seems to hold, for not only are the 
variabilities greatest at adolescence but the development is saltatory. 
Particular interest attaches to this table for height and weight are the 
only traits which we have been able to find where a large number of 
cases have been studied, in which the theory does hold, with one 
exception. The measurements of lung capacity, based on about fifty 
cases for each age, show no evidence of greater variability or salta- 
tion at adolescence. 
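
As a small check on this reading of Table I, the sketch below locates the age of greatest variability in height for one data set. The coefficients are those of column (1), Burk's data, as they can be read from the scanned table, so individual digits should be treated as provisional; the function and variable names are illustrative.

```python
# Coefficients of variability in height, column (1) of Table I (Burk's data),
# keyed by age in years. Values are read from the scan and are provisional.
burk_height_cv = {
    "boys": {8.5: 0.043, 9.5: 0.044, 10.5: 0.044, 11.5: 0.044, 12.5: 0.046,
             13.5: 0.052, 14.5: 0.055, 15.5: 0.054, 16.5: 0.047},
    "girls": {8.5: 0.046, 9.5: 0.044, 10.5: 0.046, 11.5: 0.048, 12.5: 0.051,
              13.5: 0.048, 14.5: 0.043, 15.5: 0.037, 16.5: 0.035},
}


def peak_age(cv_by_age):
    """Age at which the coefficient of variability is largest."""
    return max(cv_by_age, key=cv_by_age.get)


peaks = {sex: peak_age(series) for sex, series in burk_height_cv.items()}
lead_years = peaks["boys"] - peaks["girls"]
```

On these figures the peak falls at 14½ years for boys and 12½ for girls, in line with the statement that girls reach the highest point roughly two years earlier.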


VARIABILITY IN MENTAL TRAITS 


Coefficients of variability in mental traits were computed for the 
data reported by Gilbert,1 Pyle,2 and Bickersteth.3 In Gilbert’s 
results there are about fifty cases at each age. In Pyle’s norms the 
number of cases varies widely. Where the number involved is very 
small, the figures in Tables III and IV are enclosed in parentheses. 

Table II gives the results of the computations for the eight mental 
tests in Gilbert’s research with the averages for all the tests for the 
ages from six to seventeen years. 

Tables III, IV and V give similarly the coefficients for the data 
of Pyle and Bickersteth. 

A detailed examination of these tables shows that the period of 
greatest variability in mental traits is during the years of childhood, 
not at adolescence. The coefficients tend to decrease with fair uni- 
formity from childhood to adulthood. It is most strikingly shown in 
Pyle’s test of invention but holds almost equally well in the opposites, 


1 Op. cit. 

2 Pyle, W. H.: ‘The Mental Examination of School Children,’ New York, 1913, 
p. 70. 

3 Bickersteth, M. E.: The Application of Mental Tests to Children of Various 
Ages, British Jour. of Psychol., Vol. IX, Dec., 1917. 
TABLE II. COEFFICIENTS OF VARIABILITY AT DIFFERENT AGES IN EIGHT MENTAL 
TESTS (GILBERT’S DATA) 

[?] marks a value illegible in the source. 

Boys 
Test / Age                          6      7      8      9     10     11     12     13     14     15     16     17 
Time memory                       0.444  0.324  0.460  0.471  0.449  0.409  0.609  0.893  0.450  0.443  0.434  0.406 
Reaction time                     0.163  0.172  0.155  0.222  0.124  0.167  0.152  0.163  0.167  0.137  0.109  0.129 
Reaction time (Disc. & Choice)    0.099  0.180  0.119  0.142  0.122  0.150  0.156  0.142  0.122  0.177  0.124  0.115 
Force of suggestion               0.391  0.356  0.300  0.210  0.312   [?]   0.237  0.241  0.306  0.318  0.312  0.480 
Fatigue                           0.412  0.431   [?]   0.297  0.343  0.320  0.333  0.424  0.348  0.355  0.300  0.434 
Voluntary motor ability           0.118  0.136   [?]   0.094, then 0.108  0.102  0.101  0.102  0.083  0.091  0.068 
                                  (seven values for the eight ages 10 to 17; one value lost, alignment uncertain) 
Muscle sense                      0.400  0.333  0.377  0.431  0.511  0.372  0.397  0.500  0.576  0.355  0.400  0.433 
Sensitiveness to color 
  differences                     0.216  0.253  0.240  0.360  0.316  0.283  0.312  0.327  0.201  0.268  0.300  0.350 
Average                           0.280  0.271  0.265  0.278  0.284  0.264   [?]    [?]   0.283  0.279  0.258  0.302 

Girls 
Test / Age                          6      7      8      9     10     11     12     13     14     15     16     17 
Time memory                       0.346  0.298  0.391  0.262  0.340  0.493  0.498  0.542  0.574  0.561   [?]    [?] 
Reaction time                     0.183  0.165  0.119  0.192  0.191  0.165  0.177  0.170  0.160  0.143  0.151  0.159 
Reaction time (Disc. & Choice)    0.127  0.180  0.116  0.157  0.108  0.150  0.132  0.133  0.152  0.110  0.111  0.140 
Force of suggestion               0.400  0.356  0.273  0.212  0.284  0.288  0.220  0.237  0.283  0.276  0.260  0.387 
Fatigue                           0.328  0.331  0.304  0.376  0.374  0.305  0.478  0.394  0.508  0.496  0.479  0.318 
Voluntary motor ability           0.127  0.118  0.092  0.116  0.104  0.108  0.101   [?]   0.121  0.108  0.107  0.073 
Muscle sense                      0.309  0.333  0.418  0.440   [?]   0.500  0.394  0.535  0.416  0.305  0.353  0.406 
Sensitiveness to color 
  differences                     0.187  0.219  0.328  0.333  0.365  0.347  0.294  0.414  0.304  0.239  0.325  0.286 
Average                           0.251  0.250  0.255  0.261  0.280  0.294  0.287  0.318  0.314  0.279  0.259  0.278 

genus-species, and part-whole tests. The general tendency is clearly 
revealed in Table VI, which summarizes in a single table the variability 
in mental traits, combining the results of Pyle, Bickersteth and a 
portion of those of Gilbert, and disregarding sex differences. 

In all of Pyle’s data, the only evidence for greater variability and 
saltation is found in the free association test. In Gilbert’s results, 
the time memory test and the fatigue test (in girls only) are the only 
ones that support the theory. The results in Gilbert’s force of sugges- 
tion test are peculiar in that the variabilities are greatest at the extreme 
age limits, six and seventeen years. 

Considerable interest attaches to the variability in a general 
intelligence test in view of Freeman’s article in this journal on the 
assumptions underlying the calculation of intelligence quotients with group 
TABLE V. COEFFICIENTS OF VARIABILITY AT DIFFERENT AGES IN THIRTEEN 
MENTAL TESTS (BICKERSTETH’S DATA) 

"..." marks no entry; [?] marks a value or label illegible in the source. 

Girls 
Test / Age                           7      8      9     10     11     12     13     14     15 
Number test (1)                    0.575  0.443  0.375  0.346  0.398  0.371  0.345  0.349  0.288 
Number test (2)                    0.397  0.399  0.242  0.308  0.301  0.354  0.298  0.279  0.256 
Alphabet test (1)                  0.947  0.530  0.382  0.438  0.404  0.327  0.327  0.305  0.348 
Alphabet test (2)                  0.522  0.383  0.280  0.273  0.319  0.340  0.327  0.265  0.265 
Precision and speed of movement    0.048  0.052  0.056  0.071  0.076   [?]   0.106  0.151  0.116 
Rate of tapping (1)                0.124  0.121  0.084  0.087  0.077  0.069  0.070  0.099  0.103 
Rate of tapping (2)                0.309  0.397  0.332  0.386  0.572  0.379  0.483  0.468  0.521 
Sustained attention test           0.320  0.235  0.218  0.297  0.279  0.219  0.218  0.203  0.192 
Divided attention test             0.327  0.372  0.350  0.282  0.362  0.376  0.360  0.398  0.240 
[label illegible]                  0.396  0.314  0.257  0.276  0.309  0.281  0.281  0.279  0.269 
Memory for narrative                ...    ...   0.213  0.193  0.172  0.126  0.123  0.176  0.167 
Memory for related words            ...    ...   0.158  0.210  0.187  0.246  0.178  0.163   [?] 
[label illegible]                   ...    ...   0.233  0.184  0.153  0.213  0.182  0.177  0.219 
[label illegible]                   ...    ...   0.212  0.253  0.390  0.191  0.186  0.180  0.168 
[label illegible]                   ...    ...   0.204   [?]    [?]    [?]   0.167  0.174  0.185 

(Bickersteth, M. E.: British Journal of Psychology, December, 1917) 

TABLE VI. SUMMARY OF VARIABILITY IN MENTAL TRAITS 

Age                  Number of cases    Coefficient of variability 
 7                         408                  0.417 
 8                        1324                  0.356 
 9                        2064                  0.313 
10                        2387                  0.283 
11                        2184                  0.280 
12                        2466                  0.266 
13                        2345                  0.262 
14                        1906                  0.253 
15                         [?]                  0.257 
16                         [?]                  0.253 
17                         606                  0.266 
18                         428                  0.206 
[label illegible]         1570                  0.205 





test results. As Freeman points out, there are either or both of two 
assumptions involved in such calculations, viz., decreasing rate of 
growth in mental development or diverging lines of growth. Table 
VII gives the coefficients of variation calculated from Mrs. Pressey’s 
data. 


TABLE VII. COEFFICIENTS OF VARIABILITY IN PRESSEY’S GROUP INTELLIGENCE 
TEST 

Age     Boys     Girls 
 8      0.256    0.150 
 9      0.254    0.211 
10      0.258    0.203 
11      0.245    0.145 
12      0.168    0.132 
13      0.142    0.117 
14      0.138    0.098 
15      0.141    0.088 
16      0.081    0.080 





There is here no evidence whatsoever of increasing variability 
or saltation. On the contrary, the decrease in variability is as unmis- 
takable as it has been shown to be in practically all special mental 
traits. 
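
The downward trend in Table VII can be summarized with a least-squares slope of the coefficient on age. The sketch below is one way of doing so and is not a computation made by the authors; the coefficients are those of Table VII, and the helper name is illustrative.

```python
def slope(points):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in points)
    denominator = sum((x - mean_x) ** 2 for x, _ in points)
    return numerator / denominator


# Table VII: age -> (boys, girls) coefficient of variability.
table_vii = {
    8: (0.256, 0.150), 9: (0.254, 0.211), 10: (0.258, 0.203),
    11: (0.245, 0.145), 12: (0.168, 0.132), 13: (0.142, 0.117),
    14: (0.138, 0.098), 15: (0.141, 0.088), 16: (0.081, 0.080),
}

boys_slope = slope([(age, boys) for age, (boys, _) in table_vii.items()])
girls_slope = slope([(age, girls) for age, (_, girls) in table_vii.items()])
```

Both slopes are negative (about 0.023 per year for boys and 0.015 for girls), which is the decrease in variability with age that the text describes.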


VARIABILITY IN MENTAL TRAITS BY GRADES 


Even if there is no evidence for increasing variability and saltation 
with age, it might still be urged that, after all, pupils are not classified 
in school on the basis of age and that on a classification according to 
grade, the wider range of differences at adolescence might reveal itself. 
Many such measurements have been made but only two will be 
reported here. The first are the coefficients for five of the Courtis 
tests given to 27,171 children in the New York School Survey. Table 
VIII gives the facts. 
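
The "Average" column of Table VIII can be checked against the five test coefficients for each grade. The sketch below assumes the values as they appear in the table; the names are illustrative.

```python
# Table VIII: grade -> ([Test 1 .. Test 5], printed average).
courtis = {
    4: ([0.166, 0.299, 0.232, 0.242, 0.199], 0.228),
    5: ([0.207, 0.194, 0.242, 0.242, 0.187], 0.214),
    6: ([0.168, 0.231, 0.249, 0.150, 0.165], 0.193),
    7: ([0.147, 0.135, 0.235, 0.181, 0.159], 0.171),
    8: ([0.109, 0.111, 0.123, 0.203, 0.145], 0.138),
    9: ([0.162, 0.185, 0.188, 0.194, 0.154], 0.177),
    10: ([0.173, 0.141, 0.195, 0.191, 0.143], 0.169),
    11: ([0.159, 0.128, 0.188, 0.176, 0.144], 0.159),
    12: ([0.167, 0.119, 0.175, 0.192, 0.168], 0.164),
}


def max_discrepancy(table):
    """Largest gap between a recomputed row mean and the printed average."""
    return max(abs(sum(tests) / len(tests) - printed)
               for tests, printed in table.values())


worst_gap = max_discrepancy(courtis)
lowest_grade = min(courtis, key=lambda grade: courtis[grade][1])
```

Every recomputed mean agrees with the printed average to within rounding in the third decimal place, and the lowest average falls in Grade 8, the last elementary grade.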

The second are the variabilities in Language Scale A of the Trabue 
Completion Tests for which results are available for a large number of 
cases from Grade II upward.2 Table IX gives the coefficients of 
variability for these data. 


1 Pressey, Luella W.: Sex Differences Shown by 2544 School Children. Jour. 
of Applied Psychol., Vol. II, Dec., 1918, pp. 323-340. 

2 Trabue, M. R.: Completion Test Language Scales. Columbia Univ., Contrib. 
to Educ., 1916. 
TABLE VIII. COEFFICIENTS OF VARIATION IN COURTIS ARITHMETIC TESTS (NEW 
YORK SURVEY DATA) 

Grade    Test 1    Test 2    Test 3    Test 4    Test 5    Average 
  4      0.166     0.299     0.232     0.242     0.199     0.228 
  5      0.207     0.194     0.242     0.242     0.187     0.214 
  6      0.168     0.231     0.249     0.150     0.165     0.193 
  7      0.147     0.135     0.235     0.181     0.159     0.171 
  8      0.109     0.111     0.123     0.203     0.145     0.138 
  9      0.162     0.185     0.188     0.194     0.154     0.177 
 10      0.173     0.141     0.195     0.191     0.143     0.169 
 11      0.159     0.128     0.188     0.176     0.144     0.159 
 12      0.167     0.119     0.175     0.192     0.168     0.164 

TABLE IX 

Grade                 Number of cases    Coefficient of variability 
II                         1318                  0.454 
III                        1437                  0.380 
IV                         1463                  0.290 
V                          1507                  0.196 
VI                         1454                  0.165 
VII                        1456                  0.148 
VIII                       1427                  0.144 
IX                          273                  0.140 
X                           171                  0.116 
XI                          136                  0.094 
XII                         103                  0.103 
College graduates           114                  0.067 








The results show a rapid decrease up to the fifth grade and a 
gradual decrease thereafter. 
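
The contrast between the rapid early decrease and the gradual later one can be put in figures by comparing the average drop per grade before and after Grade V. This is an illustrative calculation on the Table IX coefficients, not one reported by the authors.

```python
# Table IX: coefficient of variability by grade (Trabue Language Scale A).
trabue = {
    "II": 0.454, "III": 0.380, "IV": 0.290, "V": 0.196, "VI": 0.165,
    "VII": 0.148, "VIII": 0.144, "IX": 0.140, "X": 0.116, "XI": 0.094,
    "XII": 0.103,
}
grades = list(trabue)  # dictionaries keep insertion order


def mean_drop(first, last):
    """Average decrease in the coefficient per grade between two grades."""
    steps = grades.index(last) - grades.index(first)
    return (trabue[first] - trabue[last]) / steps


early = mean_drop("II", "V")    # three grade steps
late = mean_drop("V", "XII")    # seven grade steps
```

The average drop is about 0.086 per grade up to Grade V and about 0.013 per grade from there to Grade XII.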


DISCUSSION AND CONCLUSIONS 


It is very evident that the law of increasing variability at adoles- 
cence does not hold for mental traits, so far as the groups for which 
measurements are available are concerned. On the contrary, there 
is in the school groups a marked reduction in variability at adolescence 
as contrasted with childhood. How is this reduction to be accounted 
for, particularly in view of the results of experiment on the effects of 
equal practice on individual differences, which have uniformly shown 
that differences do not decrease but rather increase when opportunities 
for practice are equalized? In a certain sense it may be true that the 
range of differences is greater at adolescence if we include at each age 
the mentally deficient whose abilities in any test would be zero. 
School children from whom norms in mental tests are usually obtained 
are a selected group. Even so, in view of the large reduction in the 
coefficients, it is pretty certain that the average variability would not 
show an increase with age, provided a proportionate number of border- 
line and feebleminded children were tested and these results included 
in the distributions. In any case, the pedagogical inferences are 
based on the normal school population. Selection by eliminating 
those at the lower end of the distribution curve accounts, then, in part 
for the reduced variability found but not for all of it. 

Inadequacy of training causes a narrowing of the distribution at 
the upper end. It has been shown over and over again that under 
proper stimulation, a very great increase in efficiency in mental 
functions is obtainable even in those traits which in the ordinary 
circumstances of life, are much practiced. In other words, there are 
possibilities of very great increases in efficiency in the upper ranges 
which are not realized and are not revealed in the test norms actually 
obtained. The norms, for example, for the Courtis Tests are considerably 
lower than they would be if the stimulus of experimental conditions 
were provided.1 There is then no contradiction between these findings 
and the experiments on the effects of equal practice on individual 
differences. Under ordinary conditions, the effect of equalizing 
practice is to reduce individual differences since a certain modicum 
of efficiency is all that is required. When a sufficient stimulus is 
provided, the upper limit is greatly extended and both the range and 
average variabilities are greatly increased. There is then a possibility 
that individual differences may increase at adolescence but there is no 
evidence that they actually do. 

What we need for a final answer to the problem is repeated measurements 
of a great number of unselected individuals over the entire 
period of childhood and adolescence. Such data are, of course, 
nowhere to be found now. 


SEX DIFFERENCES IN VARIABILITY 

Incidentally, in connection with this study, the ratios of the variability 
of girls to that of boys were computed. A summary of the 
results appears in Table X. The physical traits are height, weight 
and lung capacity, the data being those reported in Table I. The 
mental traits are those involved in the eleven tests by Pyle, the eight 
tests by Gilbert, and the group intelligence test by Pressey. The 
ratios for Pyle’s and Gilbert’s data were calculated for each test at 
each age and then averaged. They are not the ratios for the averages 
of the coefficients. 


1 Henmon, V. A. C.: Improvement in School Subjects Throughout the Year. 
Jour. of Educ. Research, March, 1920. 

While the results in general seem to show a greater variability 
among the boys, the differences are not great, except in the Pressey 
Test, and there are marked irregularities, notably at seventeen years. 
There are large discrepancies between the data of Pyle and Gilbert at 
eight, ten, fourteen, fifteen and sixteen years. The Pressey data show 
a remarkably greater variability among the boys, far larger than any 
other the writers have been able to discover. 
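
The distinction drawn above between averaging the ratios and taking the ratio of the averages matters because the two generally differ. A minimal sketch with made-up coefficients (not values from Table X) shows the gap:

```python
from statistics import mean


def mean_of_ratios(girls, boys):
    """Average of the girl-to-boy ratios taken test by test (the authors' method)."""
    return mean(g / b for g, b in zip(girls, boys))


def ratio_of_means(girls, boys):
    """Ratio of the averaged coefficients (the method the authors did not use)."""
    return mean(girls) / mean(boys)


# Hypothetical coefficients for two tests at one age.
girls_cv = [0.20, 0.20]
boys_cv = [0.10, 0.40]

by_test = mean_of_ratios(girls_cv, boys_cv)
pooled = ratio_of_means(girls_cv, boys_cv)
```

In this example the first method gives 1.25 and the second 0.80, so the two procedures can even disagree about which sex appears the more variable.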




























IS THE RATING OF HUMAN CHARACTER 
PRACTICABLE? 


HAROLD RUGG 
The Lincoln School of Teachers College 


AGREEMENT IN NUMERICAL RATING IS NOT AN INDEX OF AGREEMENT 
IN JUDGMENT OF CHARACTER 


I have illustrated by a striking exception the great difficulty, 
almost impossibility, of securing agreement in judging character. 
The isolation of this case will be made more evident by an accumulation 
of cases in which the details of man-to-man comparisons are 
reviewed. The unordered (yes, the chaotic) character of the judgments 
appears, irrespective of what traits are considered or of what 
kinds of scales are compared. I now believe that the evidence establishes 
the futility of obtaining single “ratings” on point scales of such 
dynamic qualities as “intelligence,” “personal qualities,” “general 
value to the service,” “leadership,” “physical qualities,” “teamwork,” 
and the like. The cases to be presented will show: (1) scales 
similar at both extremes but widely divergent in the middle; (2) scales 
alike at the lower end but dissimilar throughout the rest of the range; 
(3) scales in fair agreement but ratings made against them in great 
disagreement; (4) scales lacking in equivalence but ratings made 
against them in close agreement; (5) exact agreements in comparing 
one man with another paralleled by large disagreement in comparing 
him with a third; etc. 

I take as the first illustration judgments of “physical qualities” 
and of “intelligence.” The scales of Nos. 37 and 38 and the ratings 
which were made against them are reproduced in Table VIII. Here is 
an instance in which Nos. 37 and 38 used the same man at “15” and 
the same man at “3.” The scales are alike at the extreme ends. 
Furthermore, the same man who appears on No. 37’s “physical” 
scale at “12,” also appears at the same value on No. 38’s “leadership” 
and “intelligence” scales. Hence, the two physical scales probably 
represent closely the same differentiation. The same five men were 
rated against the physical scales. In only one instance was there 
close agreement in judgment. Nos. 37 and 38 agree that Staker is 
about the poorest captain, physically, they have known. No. 37 
judges No. 4 to be as poor as Staker, while No. 38 rates No. 4 six points 
higher. Similarly, No. 38 rates No. 11 two points higher than does No. 
TABLE VIII. COMPARISON OF SCALES CONSTRUCTED BY NO. 37 AND NO. 38 
TOGETHER WITH THEIR RATINGS ON SAME OFFICERS 

No. 37’s scales Ratings No. 38’s scale Ratings 
Average assigned assigned 
one Average by No. 37 by No. 38 
hoe of conf to officers to officers 
- ratings manner rated by saamtangl rated by 
others on him | Values or mame of both No. | Values or name of both No. 
scales scale scale 
officer 37 and officer 37 and 
No. 38 No. 38 
Physical Qualities 
15 | Bradley |......... 15 | Bradley 
13.5 77 12 | No. 36 Nos. 17, 12 | Eggleston 
‘27 ; 
, No. 27 
9 | Willard No. 30 9 No. 35 | Nos. 4, 30 
13.5 67 Nos. 17, 
11 
6 | Holzinge No. 11 6 No. 37 
9.0 
3 | Staker No. 4 3 | Staker 
Intelligence 
De hee -- Divcscuswe 15 | Luskin 
wood 
12 | Elwood No. 11 12 No. 36 No. 17 
9.8 77 
9.8 77 9 | No. 36 Nos. 17, 9 | Elwood Nos. 11, 
27 27 
6 | Ballinger No. 30 6 No. 35 No. 30 
12.0 67 
3 | Willard No. 4 3 | Holzinge No. 8 


























37. There is no general tendency for No. 38 to rate higher than No. 
37, however, for while they agree on Bradley at the highest end of 
their scales, No. 37 rates No. 17 four points higher than does No. 38, 
and 2 points higher for No. 27. At the same time they agree on No. 
30, each giving him 9 points. The topsy-turvy character of the 
ratings is brought out by such examples as these. 

The “intelligence” scales of Nos. 37 and 38 provide quite a different 
sort of comparison, namely, a case in which ratings are made 
against scales that do not represent equivalent amounts of the trait. 
(See Table VIII.) The instance is a clear exposition of the difficulty 
that is encountered in discriminating men who appear near the middle 
of a scale. Our tables show that there is a much larger probability 
that the “15” and the “3” scale-men will be more adequately discriminated 
than the “6,” “9,” and “12” men. It should be 
remembered that these scales were constructed by arranging the 
original lists in specific rank order. Thus there must be at least 
4, 5, or 6 men represented between Elwood and No. 36, who are 
reversed on the two scales, occupying the “12” and “9” positions on 
No. 37’s scale and the corresponding “9” and “12” positions on No. 
38’s scale. We have no means of stating the qualifications of the men 
who must have separated Elwood and No. 36 in these two original 
lists but certainly the conclusion can be drawn that there must have 
been a wide discrepancy in estimating the intelligence of the men 
rated. The ratings on them are: 


RATING OF INTELLIGENCE OF 5 PERSONS 

                      No. 4 | No. 11 | No. 17 | No. 27 | No. 30 
No. 37’s ratings        3   |   12   |    9   |    9   |    6 
No. 38’s ratings        8   |    9   |   12   |    9   |    6 





Two cases occur out of five in which there is exact agreement in total 
ratings built upon scales that reverse the scale-men, against whom 
the particular judgments must have been made. At the same time 
two other men are rated 12-9, 9-12 against these very same scale- 
men, and in the case of No. 4, there is a difference of 5 points, or one- 
third of the total scale. Instability of judgment, lack of assurance 
that the score represents equivalent merit, the influence of particular 
qualities on final judgment, these conclusions and suggestions occur 
to one as a result of studying such figures. 
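
The comparison of the two officers' intelligence ratings can be restated numerically. The sketch below uses the ratings given above for the five men and takes 15, the top scale value, as the total scale, which is the basis on which a 5-point difference amounts to one-third; the variable names are illustrative.

```python
from fractions import Fraction

SCALE_TOP = 15  # highest value on the rating scale

# Intelligence ratings given to the same five officers by two raters.
ratings_37 = {"No. 4": 3, "No. 11": 12, "No. 17": 9, "No. 27": 9, "No. 30": 6}
ratings_38 = {"No. 4": 8, "No. 11": 9, "No. 17": 12, "No. 27": 9, "No. 30": 6}

# Absolute disagreement, in scale points, for each officer rated by both.
differences = {
    officer: abs(ratings_37[officer] - ratings_38[officer])
    for officer in ratings_37
}

exact_agreement = [officer for officer, gap in differences.items() if gap == 0]
largest = max(differences, key=differences.get)
share_of_scale = Fraction(differences[largest], SCALE_TOP)
```

Two of the five men receive identical ratings, two are reversed by 3 points, and No. 4 differs by 5 points, or one-third of the scale, even though identical ratings were made against scale-men placed in reversed order.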

Before leaving this part of the discussion let us make one more 
comparison of scaling and rating ‘‘intelligence.”” Table [X supplies 
the data, together with intelligence test scores and average-ratings on 
each man. Note that No. 21 is rated one interval below No. 17 on 
No. 7’s scale, but three intervals below No. 17 on No. 22’s scale. 
Furthermore, in the construction of the scales McKinley is two inter-: 
vals superior to No. 17 on No. 7’s scales, whereas he is one interval 
inferior on No. 22’s scale—a difference of 9 points or three-fourths of 























ee ee eens 
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the total scale. Such evidence shows that it is exceedingly difficult 
to maintain even the same rank order in placing men on the scale and 
in rating others against them. Note, too, that No. 43 is given “15” 
on No. 7’s scale and “8” on No. 22’s scale. This results in some 
interesting anomalies. No. 7 judges him to be as intelligent as 
McKinley and two intervals (6 points) better than No. 17. No. 22 
judges him to be 4 points inferior in intelligence to McKinley and 7 
points inferior to No. 17. Contrasted with these dissimilarities in 
rating, note that No. 24 is rated 9 on each scale and No. 21, 6 on each 
scale. The table proves, however, that we may not deduce, from the 
fact that an officer is given exactly the same rating by two officers, that the 
rating represents closely similar estimates of intelligence as contributed 
to by “man-to-man” comparison. Analysis of such cases, which I 
am certain are typical, shows that rating scales made even under such 
well-controlled conditions as were those at Camp Taylor will contain 
discrepancies in placing scale-men and in estimating human traits upon 
them of between one and two scale intervals, that is, between 25 and 50 
per cent of the total scale. 
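
The percentages quoted in this passage follow from simple arithmetic on the scale. The sketch below assumes, from the values visible in Table IX, that the intelligence scale runs from 3 to 15 in intervals of 3 points; the function name is illustrative only and is not part of the study.

```python
# Assumed from the values shown in Table IX: the intelligence scale
# has the points 3, 6, 9, 12, 15 (interval of 3, range of 12).
SCALE_MIN = 3
SCALE_MAX = 15
INTERVAL = 3


def share_of_scale(points: float) -> float:
    """Express a difference in points as a fraction of the scale's range."""
    return points / (SCALE_MAX - SCALE_MIN)


for intervals in (1, 2, 3):
    points = intervals * INTERVAL
    print(f"{intervals} interval(s) = {points} points = "
          f"{share_of_scale(points):.0%} of the scale")
```

On these assumptions one and two intervals come to 25 and 50 per cent, and the 9-point difference noted for McKinley comes to three-fourths of the scale.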





TABLE IX. COMPARISON OF SCALES CONSTRUCTED BY No. 7 AND No. 22 TOGETHER 
WITH THEIR RATINGS ON SAME OFFICERS 

Columns: average position on others’ scales | average of conference ratings on him | No. 7’s scale: values | No. 7’s scale: number or name of scale officer | ratings assigned by No. 7 to officers rated by both No. 7 and No. 22 | No. 22’s scale: values | No. 22’s scale: number or name of scale officer | ratings assigned by No. 22 to officers rated by both No. 7 and No. 22 

| 15 McKinley | No.43 | 15 | No. 17 
10.8 82.0 | | | | 
4.3 | 51.8 12 | No. 4 No.4 | 12 McKinley 
10.8 | 82.0 9 No. 17 No. 24 | 9 | No.7 Nos. 4. 11, 
| 244 
11.0 51.8 Ue) Pees er) ) PE eer! eee No. 43 
6 | Whitfield | No. 21,5 6 | No. 21 No. 21 
11 
4.5 56.0 | | 
3.0 55.5 3 | No. 28 ever 3 | No. 32 | 
6.0 63.0 | | 
We turn next to some illustrations of ratings on a person’s general 
qualities. In making the army rating scale, the practical army officers 
insisted on adding a group of qualities called “general value to the 
service.” In addition to judging a man’s intelligence, his personal 
qualities, his physical qualities, and his ability as a leader, they wished 
to measure what he was worth to the army as an all-round man. 
Hence, we have, in scales for “general value,” a summary evaluation 
much like that obtained by totalling the estimates of particular traits. 
However, there is no discernible difference in accuracy or inaccuracy 
in rating such a totality as distinguished from rating a more particularized 
group of qualities. 

Two sets of scales are given in Tables X and XI. The scales of 
Nos. 11 and 19 provide a very helpful comparison of scale-placement 


TABLE X. COMPARISON OF SCALES CONSTRUCTED BY No. 19 AND No. 11 TOGETHER 
WITH THEIR RATINGS ON SAME OFFICERS 

Columns: average position on others’ scales | average of conference ratings on him | No. 19’s scale: values | No. 19’s scale: number or name of scale officer | ratings assigned by No. 19 to officers rated by both No. 19 and No. 11 | No. 11’s scale: values | No. 11’s scale: number or name of scale officer | ratings assigned by No. 11 to officers rated by both No. 19 and No. 11 

General Value 
40 | McKinley }......... 40 No. 17 No. 37 
30.4 82.0 
32 | Hotze No. 11 32 | No. 12 No. 12 
32.0 70.0 
No. 37 
No. 21 
30.4 82.0 24 | No. 17 No. 24 24 |Rumpel | Nos. 24, 
Staker 19 
No. 12 
No. 21 
No. 22 
22.0 55.6 aS apes) | Behar eek 16 | No.7, No. 22 
Staker 
22.0 55.6 
11.4 51.8 = = On fore 8 | No. 4 
11.4 51.8 

TABLE XI. COMPARISON OF SCALES CONSTRUCTED BY No. 22 AND No. 19 TOGETHER 
WITH THEIR RATINGS ON SAME OFFICERS 

Columns: average position on others’ scales | average of conference ratings on him | No. 22’s scale: values | No. 22’s scale: number or name of scale officer | ratings assigned by No. 22 to officers rated by both No. 22 and No. 19 | No. 19’s scale: values | No. 19’s scale: number or name of scale officer | ratings assigned by No. 19 to officers rated by both No. 22 and No. 19 

General Value 
40 | McKinley |......... 40 | McKinley 
No. 11 
30.4 82.0 32 | No. 17 No. 11 32 | Hotze | 
No. 24 
24.0 63.0 Pe See, Riavdscnne 24 | No. 17 
30.4 82.0 
No. 24 
No. 21 
11.4 51.8 a a Seer eee 16 | No.7 
22.0 55.6 | 
14.0 56.0 8 | No. 21 No. 21 8 | No. 4 | 
11.4 51.8 | | 

and ratings for “general value” because of the fact that 3 of the 5 
scale-men are the same on the two scales. Furthermore, four officers 
have been rated against these scale-men. Note that the two scales 
are equivalent at the low end but that No. 11’s scale contains No. 17 
at “highest,” whereas No. 19’s scale places No. 17 halfway down the 
scale, at “middle.” Here is an instance of wide disagreement (16 points) 
in the placing of one of the scale-men used, with perfect agreement in 
placing two more. The suggestion will occur that it might be caused 
by the difference in the “spread” of ability represented in the acquaintance 
of the two men. It probably is not, however, for McKinley, 
No. 19’s “highest” man, is used by No. 11 at “highest” on intelligence; 
and No. 12, who appears as “high” man on No. 11’s scale, is 
used twice as “high” man on No. 19’s scale. Thus two of these three men 
are known to both No. 19 and No. 11 and there is a definite tendency 
to agree on the placement of these men in other qualities. Hence, 
the lack of agreement in scaling No. 17 must be due to distinct differences 
in estimating the abilities of the two men. 
Now let us compare the ratings on these scales, which are obviously 
not equivalent above the “16” point. No. 19 gives No. 21, No. 12, No. 
24 and No. 22 very closely the same rating, namely 22, 23, 24 and 20 
respectively. But No. 11 rates No. 12 as twice as valuable to the 
service (32 points) as No. 22, to whom he gives 16 points! In the same 
fashion No. 11 rates No. 12 one whole interval on the scale better than 
No. 24 at the same time that No. 19 judges him to be slightly poorer 
than No. 24. 

No. 19 rates No. 21 three points lower than No. 17, whereas No. 11 
rates No. 21 fourteen points lower. In this instance a difference in 
judgment in placing the scale-men merely accentuates the difference 
in judgment in rating on the scale. 

On the other hand, No. 37, a major, is rated “30” by No. 19, that is, 
7 points superior to No. 12. At the same time No. 37 is rated “40” by 
No. 11, that is, 8 points superior to No. 12, who appears on No. 11’s 
scale at “32.” In this case the rating on “general value,” which 
differs by 10 points, represents closely the same relative judgment of 
two men who were involved in the comparison. It is clear that when 
the two scale-men at the lowest end of the scales are the same it does 
not necessarily follow that judgments made near the middle of the 
scale will be closely the same. No. 22 is rated “20” and “16” respectively 
by the two raters when compared with No. 7 and No. 4, who 
appear at the two lowest points on the scale. The difference in placement 
of No. 17 has contributed to very material differences in rating 
at the high end of the scale. Another instance of wide lack of agreement 
in judgment is found in the rating of Staker, a major, who is 
given “24” by No. 19 and “16” by No. 11. Furthermore, in direct 
man-to-man comparison he is rated “8,” that is, as equal to No. 7 by 
No. 11 and 8 points better than No. 7 by No. 19. If such a divergence 
appears small it should be remembered that a similar difference in rating 
on all other qualities of the scale will amount, on the average, to a 
difference of 20 points in total rating. 
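
The figure of 20 points can be checked by carrying the same proportional gap over to the other qualities. The sketch assumes, as the scale values quoted in the text suggest, four qualities scored up to 15 and “general value” scored up to 40; the names below are illustrative and do not come from the study.

```python
# Assumed maxima, inferred from the text: four qualities scored up to 15
# and "general value" scored up to 40 (lowest total 3 + 3 + 3 + 3 + 8 = 20).
OTHER_QUALITY_MAXIMA = [15, 15, 15, 15]
GENERAL_VALUE_MAX = 40


def total_difference(general_value_diff: int) -> float:
    """Total-rating gap if the same proportional gap recurs on every quality."""
    others = sum(general_value_diff * m / GENERAL_VALUE_MAX
                 for m in OTHER_QUALITY_MAXIMA)
    return general_value_diff + others


# An 8-point gap on "general value" is one-fifth of that scale;
# one-fifth of 15 is 3 points on each of the four other qualities.
print(total_difference(8))  # 20.0
```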

Do not such illustrations¹ raise serious doubts concerning the validity 
of ratings of human traits on point scales? They prove to me 
that the task of comparing one person’s qualities with another’s is 
fraught with so much difficulty as to be impractical in rating the rank 
and file of persons and for most practical activities of life. This 
study is convincing of the difference in distinguishing persons at the 
extreme ends and the middle portion of the scale. If a person stands 
out conspicuously from his group for the presence or lack of a particular 
quality, it is much easier for his associates to agree in discriminating 
that quality. 

1 I omit many other illustrative tables and scales because of lack of space. 
The situation for “personal qualities” and “leadership” is precisely the same as for 
those reported. 

But this very fact brings to the forefront one of the most impor- 
tant characteristics of the process of judging character. That is the 
role played by conspicuous traits in dominating reactions to total 
personalities. 


HOW DO WE JUDGE OUR FELLOWS? 


The Dominating Role of General Mental Attitudes and of Conspicuous 
Traits. With considerable hesitation I advance, at this point, a 
theory to help explain the process of judging human character. I 
shall merely outline it at this time, wishing to elaborate it more fully 
later. 

Two facts seem to be of paramount significance: first, we rate or 
judge our fellows in terms of a general mental attitude toward them; 
second, there is, dominating this mental attitude toward the personality 
as a whole, a like mental attitude toward particular qualities. 
Some illustrations will supply the basis for these statements. 

The striking case of Captain X.—Take first the most objectified 
case we have, a case in which separate judgments of a person’s intelli- 
gence can be compared directly with several objective measures of his 
intelligence. Captain X was so well known and was so conspicuous in 
his group that he was used by 13 officers on 20 different subordinate 
scales—physical qualities, intelligence, leadership, etc. On each of 
these 20 scales he was elected to be “the poorest man I ever knew.” 
Furthermore, he was so very conspicuous that three officers used 
Captain X as the “3” (lowest) man on four out of five of their scales. 
To them he was so outstandingly a weak man that there was no 
question of using another fellow captain for the lowest position on the 
different scales. 

Now consider the objective measures of his abilities. On three 
different psychological tests (written group tests), Captain X was first 
ranking man among 151 officers. He scored 206 out of a possible 212 in 
the Army Alpha test. He scored 151 and 144 respectively on two 
forms of the Thorndike Alertness Test (which is Part I of his college 
entrance examination). He completed the test each time within the 
time limit of 30 minutes—29 minutes and 20 minutes respectively. 
Moreover, he had been regarded only a few years before as an all-round 
man, for he was a Rhodes Scholar at Oxford from a middle-western 
state university. At Oxford he made such a record that he 
was excused from certain examinations. Here then is a startling 
example of divergence between ability-to-do and our judgment of it. 

Now, what was the explanation? I asked, separately, 8 of those 
who used him on the scale, why they had used him at “3.” Their 
comments pointed out, indubitably, that their estimates of Captain 
X’s intelligence, his physical qualities, his leadership, were dominated 
by their opinions of his personal qualities. They were unanimous in 
saying that it was impossible to “live with him.” He was a “rotter,” 
or “yellow,” or a “knocker,” or “conceited.” The man’s personal 
qualities loomed so large in the process of judging as to play a completely 
domineering role. I believe it operated in the case of these 
eight men as a definite inhibition to the process of “judging.” It is 
not possible that they really “judged” his intelligence, for example. 
They were controlled by a predisposition, a bias, a prejudice. This 
predisposition was a general mental attitude toward Captain X, 
dominated primarily by an attitude toward him as a social associate. 
This attitude had been built up by countless personal reactions on the 
drill ground, at the mess table, in quarters at rest times and the like. 
And these general mental reactions were determined very generally 
by the overpowering effect of particular kinds of responses which he 
had made. I personally believe that these reactions, furthermore, 
were determined by the way they interpreted his attitudes towards 
them. Is it not a condition of very general prevalence that we react 
to another in terms of how we think he will affect us and our future? 
We ignore him or we pay close attention to him. We accept what he 
says to us or about us in terms of an attitude of confidence in how he will 
affect us. Our interpretation of the same identical remark made by a 
close friend and a hostile colleague is determined by our general feeling 
of the way he probably means it. His responses are to us symptoms of 
what he wants to have happen to us. I shall intrude more of this 
theory on the reader later on. First let us look at another illustration 
of what we are discussing. 

In Table VIII we have another typical case, that of No. 4 rated 
by No. 37 and No. 38. At two different conferences No. 38 rated 
him “9,” that is, mediocre; No. 37 rated him “3” each time. No. 4, 
however, was rated by No. 37 as a “lowest” man in each of the 5 
qualities (3, 3, 3, 3, 8), giving a total of 20, the lowest rating a man 
can be given. Thus this is probably a case in which we do not have 
an accurate and direct comparison between two scale-men and a 
third man, for the “3” men on the two scales are the same. In 
such a case it seems clear that there must be influencing the judgment 
a general attitude toward No. 4 that is such as to preclude careful 
analysis of his separate qualities. No. 4 is rated by the group as a 
whole as somewhat below average—the average of the conference 
ratings on him is 51. He stood out as an “average” man in the 
psychological and alertness tests. He is a college trained man and 
advanced rapidly in salary in the three years preceding entrance into 
the service. On the whole, No. 37’s rating of No. 4 can be interpreted 
as a case in which a general attitude, mistaken certainly in some particulars, 
contributes to an error in judgment of an officer with respect 
to specific qualities. Sufficient evidence is not at hand concerning 
such instances for us to draw large generalizations. The suggestion 
comes insistently, however, that one of the most potent influences working 
against accurate estimates of character is the prevalence of just such 
general attitudes toward our associates and subordinates. 

It is very difficult to show the influence of a rater’s judgment of one 
set of qualities on his judgment concerning another set. The statisti- 
cal data compiled in this investigation have been carefully canvassed 
for the determination of such possible influences; this has led to 
very little mass data that are helpful. It is believed that the only 
way in which the human aspects of this problem can be completely 
analyzed is by association during a considerable time with rating officers 
and their subordinates. My experience with the 151 officers of this 
study prohibited more than a very general comment on this matter. 

We have brought together the slight statistical evidence that has 
been found to bear upon this problem. The degree of probability 
can be stated that an officer who is assigned to a given scale value on 
one quality of the rating scale will be assigned to the same scale-value 
on another quality. The study of the detailed tables makes it clear 
that the chances are about 11 to 1 that an officer who is assigned to a 
given scale value on one quality of the rating scale will be assigned 
either to the same scale-value on another quality or to the one above 
it or the one below it. That is, the chances are about 11 to 1 that the 
deviation in the second scale-value will not be greater than one 
interval. On the other hand, the chances vary from two to one, to 
one to two (with the qualities in question) that the officer will be 
assigned to the same identical scale-value. 
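
For readers who prefer probabilities to “chances,” the odds quoted above convert directly. This is a minimal sketch of that conversion only; the function name is illustrative.

```python
def odds_to_probability(in_favor: float, against: float) -> float:
    """Convert odds of 'in_favor to against' into a probability."""
    return in_favor / (in_favor + against)


# Within one interval of the first scale-value: 11 to 1.
print(round(odds_to_probability(11, 1), 3))  # 0.917
# Exactly the same scale-value: from 2 to 1 down to 1 to 2.
print(round(odds_to_probability(2, 1), 3))   # 0.667
print(round(odds_to_probability(1, 2), 3))   # 0.333
```

That is, roughly eleven cases in twelve fall within one interval, while exact agreement occurs in between one-third and two-thirds of cases, depending on the pair of qualities.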
The data presented so far not only invalidate single judgments of 
character, but they also complicate the practice of using “agreement 
of judgments” as a criterion of the validity of the rating scale itself. 
We have canvassed definite examples which have shown that identical 
ratings may be contributed to by very dissimilar judgments; likewise 
that widely divergent total ratings may be based upon comparisons 
with equivalent scales that must have represented close agreement in 
judgment; furthermore, that differences in total ratings were not 
paralleled by differences in scale making and the like. We are fortunate 
in having the direct comparisons of judgments of a trait and the 
objective measurement of it, in the case of intelligence. The direct 
evidence is conclusive of the worthlessness of a preponderance of the 
“ratings.” 

But, there is another angle to this matter of subjective estimates 
of character. We have shown that with a most refined technique— 
with one so refined that it cannot be employed in general practice— 
ratings are not adequate measures of character. We need still to 
know whether this refinement in the construction of scales and in 
making ratings improves the case for rating. 

The answer is: It does—apparently a definite amount, but yet 
not enough to suggest the general use of point rating scales. Turn 
back and compare the average differences in the official ratings with 
the average differences in the experimental ratings: 10 to 20 points 
against 6 and 7 points. A tremendous improvement was effected in 
the army ratings by sending instructors out from Washington to 
lecture to rating officers and to teach them how to make scales. 
There is no doubt that the 50 to 75 per cent reduction in variability 
of judgment was effected largely by this mass instruction. 

This has important educational implications. The marking or 
rating of teachers and students on a general point scale, without the 
aid of man-to-man comparison is closely analogous to what raters did in 
those spring and summer official ratings in 1918. And our evidence 
shows they were valueless as measures of character. 

The instruction and the refined technique in the experimental 
groups enormously improved the rating, but it was the instruction and 
the fact that raters did actually make and use scales in accordance with 
directions, that caused the improvements. It was not the added refinement 
of the experimental technique that brought about the improvement. That 
is shown clearly by a comparison of the Fort Sheridan data collected 
by Colonel Coss and our Camp Taylor data. Colonel Coss used the 
general directions of the initial form of the scale. We used two 
distinct refinements at Camp Taylor: first, making an original list of 
at least 25 persons; second, ranking the original list for each quality 
separately. I cannot see that the refined technique actually improved 
the results at all. The average differences, for example, are just as 
large in the case of the Taylor data as in the case of the Fort Sheridan data. 
The real nub of the matter is, I believe, that the errors in judging 
complex traits cause variations in independent judgments so great 
as to more than offset any reductions in variability of judgment due to 
improved technique. 


RATING OF CHARACTER NEARLY A CHANCE EVENT 


The examples we have studied in the past few pages reveal many of 
the attributes, indeed, of a chance situation. We should seriously 
consider, I believe, whether the making of a judgment of the character 
of our fellows does not closely approximate such conditions. I have 
considerable correlation evidence which bears directly upon that 
thought. 

The correlations between officers’ ratings and scores made upon the 
army psychological test were computed for 15 lots of 300 officers each, 
4500 officers in all. The 15 lots were taken at random from 100,000 
officers, one-third second lieutenants, one-third first lieutenants and 
one-third captains. I assume that there is sufficient overlapping in 
the abilities under examination (ratings and performance) to lead to 
the expectation of a correlation of 0.5 to 0.6 between the two measures. 
What do we find? In each case r was less than 0.05. Most of them 
were 0.00. Obviously, the July official ratings were completely a 
matter of “chance.” Apparently one might as well have numbered 
his men and assigned ratings by drawing balls from a bag as to rate as 
was done in July, 1918. 

How much was the situation changed by the instruction and refined 
technique of the Camp Taylor experiment? The coefficients for 9 
correlation tables which we tabulated for psychological and alertness 
test scores and ratings (number of cases varied from 35 to 137) were 
respectively: 0.08, 0.08, 0.09, 0.11, 0.14, 0.15, 0.20, 0.21 and 0.23; 
average 0.15. Hence, while we did obtain a relatively better measure 
of an officer’s traits the difference was slight. A correlation of 0.15 
implies a very wide divergence from close correspondence. It is a 
very “low” correlation. Several of the “experimental” correlations, 
indeed, were nearly pure chance situations. 
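
How low these coefficients are can be made concrete by squaring them. Reading r squared as the share of variance that two measures have in common is a later convention, not the author's own, and is offered here only as an aid. The sketch uses the nine coefficients listed above and nothing else.

```python
from statistics import mean

# The nine Camp Taylor coefficients as listed in the text.
camp_taylor_r = [0.08, 0.08, 0.09, 0.11, 0.14, 0.15, 0.20, 0.21, 0.23]

unweighted_mean = mean(camp_taylor_r)
shared_variance = [r ** 2 for r in camp_taylor_r]

print(f"unweighted mean r = {unweighted_mean:.3f}")
print(f"r = 0.15 gives r squared = {0.15 ** 2:.4f}")
print(f"largest r squared = {max(shared_variance):.4f}")
```

The unweighted mean of the listed values comes to about 0.14; the reported average of 0.15 may reflect rounding or a weighting by number of cases, which the text does not specify. On this reading even the highest coefficient leaves about 95 per cent of the variation unaccounted for.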
How can the reliability of a rating be increased, if not by improving 
the technique of scale-making and rating? Clearly by getting many 
independent ratings on a person and averaging them. In the Camp 
Taylor experiment we were able to do that in an exceptional way. 

Averaging the ratings on 22 officers (number of independent 
ratings on an officer varying from 3 to 13) and correlating with 
psychological test score gives for three groups: 


22 officers r = 0.48 ± 0.07 
37 officers r = 0.51 ± 0.09 
126 officers r = 0.36 ± 0.05 


Here we have a striking example of the effect of getting many 
judgments and averaging them. The judgment of the individual, 
taken by and large is of little value. The judgment of the mass is 
close to the truth. 
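
The effect of averaging can be illustrated with a small simulation. This is a sketch on invented data, not a reanalysis of the Camp Taylor ratings: it assumes each rater reports the true trait plus an independent random error, and the sample size, number of raters and error spread are arbitrary choices. Real raters share biases of the kind described earlier, so the gain in practice would presumably be smaller.

```python
import random


def pearson(xs, ys):
    """Product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n_people=2000, n_raters=10, noise_sd=3.0, seed=0):
    """Return (r for one rater, r for the mean of all raters) against the truth."""
    rng = random.Random(seed)
    truth = [rng.gauss(0, 1) for _ in range(n_people)]
    # Hypothetical model: rating = true trait + independent rater error.
    ratings = [[t + rng.gauss(0, noise_sd) for _ in range(n_raters)]
               for t in truth]
    single = [r[0] for r in ratings]
    averaged = [sum(r) / n_raters for r in ratings]
    return pearson(truth, single), pearson(truth, averaged)


r_single, r_avg = simulate()
print(f"one rater: r = {r_single:.2f}; mean of 10 raters: r = {r_avg:.2f}")
```

Under these assumptions the errors of independent raters partly cancel in the average, so the averaged rating tracks the underlying trait much more closely than any single rating does.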

(Further evidence and a summary interpretation will appear in the 
February issue.) 





THE RELIABILITY OF RANKINGS BY GROUP 
INTELLIGENCE TESTS 


DENTON L. GEYER, 
Chicago Normal College 


When a large number of persons are ranked according to their 
intelligence, will one group test of intelligence place them in about the 
same order as another? If school children are to be assigned to 
classes on the basis of intelligence, will all tests place a child in the 
same class, or will the class-section assigned to a given pupil vary with 
the test used? It is the purpose of this paper to discuss some of the 
evidence regarding these problems which may be secured by giving 
two group tests to the same pupils. 

The Otis Intelligence Test and the Illinois Examination were given 
by the same person in the junior high school grades of the Chicago 
Normal School during 1919 and 1920, and when 120 of the pupils 
were ranked on the basis of scores in the two tests—using only the 
intelligence division of the Illinois Examination—the median change 
of rank from one test to the other was found to be 18 places. The 
maximum change which could have been effected throughout the 
group was 60 places and the change left to chance would be 40 places. 
Six pupils changed rank more than 60 places; 37, or 30.8 per cent, 
less than 10 places; and 15, or 12.5 per cent, less than 5 places. The 


TABLE I. AMOUNT OF DISAGREEMENT BETWEEN TWO INTELLIGENCE TESTS IN 
DIVIDING ONE HUNDRED TWENTY PUPILS INTO FOUR SECTIONS 
ACCORDING TO MENTAL ABILITY 

Comparative results from Illinois examination 

Order of intelligence according to Otis test | Number displaced one section or more | Number beyond middle of adjacent sections | Number displaced two sections or more | Number beyond middle of second section | Number displaced three sections | Number beyond middle of third section 
Section A | 13 | 11 | 6 | 4 | 3 | 1 
Section B | 19 | 11 | 3 | 1 | 1 | 
Section C | 19 | 11 | 3 | | | 
Section D | 11 | 5 | 3 | 2 | | 
Totals | 62 | 38 | 15 | 7 | 4 | 1 

Section A contains the thirty brightest pupils as revealed by the Otis scores, Section B the thirty 
next brightest, and so on. 

coefficient of correlation between the two sets of scores, by the rank 
difference method, was 0.642. If these 120 pupils had been divided 
on the basis of the intelligence scores of one test into four class- 
sections of ordinary size, 51.6 per cent of them would have been in 
the wrong section according to the other test, and 31.8 per cent of 
them would have been out of place by an amount equal at least to 
half the range of such a class-section. 
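
The “rank difference method” is presumably Spearman's formula, which needs only the differences between each pupil's two ranks. A minimal sketch follows, assuming no tied ranks; the five pairs of ranks are hypothetical and are not the Chicago data.

```python
def rank_difference_correlation(ranks_a, ranks_b):
    """Spearman's rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), assuming no ties."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical ranks of five pupils on two tests (illustration only).
otis_ranks = [1, 2, 3, 4, 5]
illinois_ranks = [2, 1, 4, 3, 5]
print(rank_difference_correlation(otis_ranks, illinois_ranks))  # 0.8
```

Identical orderings give 1 and a complete reversal gives -1, so a value such as 0.642 indicates substantial but far from complete agreement in order.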

The Thurstone and the Brown University tests when similarly 
given to 54 students of college freshmen grade showed a median 
change in rank of 6.7 places, as compared with a maximum possible 
change of 27 and a random change of 18. Twenty-four students 
changed rank less than 5 places, and 33 less than 10 places. Dividing 
these students into two classes in the order of their scores on one test 
would have placed 26 per cent of them in the wrong class according 
to the other test, and would have put 5.5 per cent of them out of place 
by as much as half the range of such a class. The correlation between 
scores is 0.74. A sophomore group of 64 students when given these 
two tests showed a median change in rank of 10.4 places, with 14 
whose change of rank was less than 5 places, and 31 whose change of 
rank was less than 10 places, but with 5 whose change of rank was 
more than 30 places. The correlation between these scores is 0.613. 
Dividing the sophomore group into two classes on the basis of the 
scores in one test would have placed 32.8 per cent of them in the wrong 
class according to the other test, and would have put 6.3 per cent of 
them out of place by at least half the range of each class so formed. 
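
The benchmarks against which these changes of rank are judged (27 and 18 places for 54 students; 60 and 40 places for the 120 pupils) can be reproduced under one reading: that the “maximum” is the average change when the order is completely reversed, and the “random” figure is the average change when the second ranking is unrelated to the first. That reading is an assumption, though it matches the printed figures; the function names are illustrative.

```python
from fractions import Fraction


def mean_change_if_reversed(n: int) -> Fraction:
    """Average change of rank when rank i becomes rank n + 1 - i."""
    return Fraction(sum(abs(n + 1 - 2 * i) for i in range(1, n + 1)), n)


def mean_change_if_random(n: int) -> Fraction:
    """Average |i - j| over all pairs of ranks: the expected change
    when the second ranking is a random shuffle of the first."""
    total = sum(abs(i - j) for i in range(1, n + 1) for j in range(1, n + 1))
    return Fraction(total, n * n)


for n in (120, 54):
    print(n,
          float(mean_change_if_reversed(n)),
          round(float(mean_change_if_random(n)), 1))
# 120 60.0 40.0
# 54 27.0 18.0
```

The reversed-order figure is n/2 and the chance figure is very nearly n/3, so a median change of 6.7 places among 54 students is a little over one-third of what chance alone would give.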

These results may be compared with those secured by J. A. Clement 
in giving five of the group intelligence tests to 49 students in Northwestern 
University.¹ The Pearson correlations he secured were: 
Army-Thurstone, 0.60; Army-Otis, 0.57; Army-Pressey, 0.36; Army- 
Indiana, 0.36; Otis-Thurstone, 0.46; Otis-Pressey, 0.44; Otis-Indiana, 
0.34; Thurstone-Pressey, 0.25; Thurstone-Indiana, 0.25; Pressey- 
Indiana, 0.22. It is here seen that the Indiana Mental Survey test 
correlates with none of the others by as much as 0.40, and that the 
Pressey Cross Out test has no correlations as high as 0.45. None of 
the other correlations can impress us as remarkably high when we 
remember that in each comparison we are presumably considering 
two measurements of the same thing. 
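
The two statements about the Indiana and Pressey tests can be checked directly from the ten coefficients just listed. The sketch simply finds, for each test, its highest correlation with any other; the data are those quoted from Clement, and the variable names are illustrative.

```python
# Pearson correlations reported by Clement, as quoted in the text.
correlations = {
    ("Army", "Thurstone"): 0.60, ("Army", "Otis"): 0.57,
    ("Army", "Pressey"): 0.36, ("Army", "Indiana"): 0.36,
    ("Otis", "Thurstone"): 0.46, ("Otis", "Pressey"): 0.44,
    ("Otis", "Indiana"): 0.34, ("Thurstone", "Pressey"): 0.25,
    ("Thurstone", "Indiana"): 0.25, ("Pressey", "Indiana"): 0.22,
}

# For each test, keep the highest correlation it shows with any other test.
highest = {}
for (test_a, test_b), r in correlations.items():
    for test in (test_a, test_b):
        highest[test] = max(highest.get(test, r), r)

for test, r in sorted(highest.items(), key=lambda item: -item[1]):
    print(f"{test:10s} highest r = {r:.2f}")
```

The Indiana test peaks at 0.36 and the Pressey test at 0.44, which agrees with the limits of 0.40 and 0.45 stated above.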





1 Clement, J. A.: Use of Mental Tests as a Supplementary Method of Making 
School Adjustment in Colleges. Educational Administration and Supervision, 
November, 1920, 6, pp. 433-444. 
That they are actually measurements of the same thing is rather 
difficult to believe when we think of the much higher correlation found 
to exist between two forms of the same test. For example, Otis¹ 
reports a correlation between his Forms A and B of from 0.74 to 
0.94, while Snarr,² using this material with 306 pupils, found a correlation 
of 0.79, and Colvin,³ in computing the relationship in about fifty 
schoolrooms, found a relation between these forms running as high as 
0.90 and averaging 0.83. Between the two similar halves of the 
Brown University Test, Colvin⁴ has also found a correlation of 0.76. 
Comparing this agreement with the wide variation cited above leads 
one to doubt somewhat that the different tests, though all called 
“general intelligence tests,” are really measuring the same element in 
the pupil’s endowment. 


SIGNIFICANCE OF THESE DIFFERENCES 


The importance of this variation among the tests probably depends 
upon the purpose for which the results are to be used. For guiding 
the pupil in the choice between two studies—a foreign language and 
manual work—the Otis scores have proved by a two-year trial in the 
junior high school named above to be of real practical value. With a 
few adjustments in cases of extreme divergence from scholarship 
records, this classification has proved workable and in general satis- 
factory. Of course, it may be said in objection that this fact does not 
show that the tests picked out pupils of intelligence but rather that 
they selected pupils of superior “literacy,” and that this case is but one 
more bit of evidence that the so-called intelligence tests are really 
language tests. It might possibly be said further that for vocational, 
rather than educational, guidance these tests cannot be expected to 
function successfully until the vocations are classified as to the famili- 
arity with language forms which is required in each, and that even if 
the tests worked then, they would not thereby be proved to be reliable 
intelligence tests but, rather, reliable tests of literacy. Even so, 
there may be a relationship close enough between intelligence and 


1 Otis, A. S.: An Absolute Point Scale for the Group Measurement of Intelligence. 
Journal of Educational Psychology, May, 1918, 9, pp. 237-261. 
2 Snarr, O. W.: Reliability of General Intelligence Tests in Classifying High 
School Pupils. Unpublished master’s thesis, University of Chicago, June, 1919. 
3 Colvin, S. S.: Some Recent Results Obtained from the Otis Group Intelligence 
Scale. Journal of Educational Research, January, 1921, 2, pp. 1-12. 
4 Colvin: Educational Tests at Brown University. School and Society, 10, p. 27. 

46 The Journal of Educational Psychology 


linguistic ability to allow us in many situations to consider these 
scores as real, even if indirect, indices of intelligence. There is no 
lack of evidence that persons selected by tests very similar to those 
under discussion were found by trial to be the persons most proficient 
in types of work making little use of written language or other symbols. 
The Army Intelligence Tests selected men in a way that corresponded 
very closely with the selection made on the basis of general military 
value by officers knowing the men well. For example,¹ in twelve 
companies the average correlation between rankings by intelligence 
tests and rankings by officers on the basis of soldier value was 0.536, 
and in seven of the twelve companies it ranged from 0.64 to 0.75. 
A great deal of evidence of this kind could be cited from the records 
of the army psychologists. It seems in this connection to lead toward 
the conclusion that, in consideration of the comparatively small 
amount of use which the common soldier makes of written symbols, the 
Army Tests were of a truth measuring some quality other than 
literacy which was valuable in practical life situations; and that, in 
consideration of the fact that the correlations would always be kept 
low by the number of qualities besides intelligence which make for 
military efficiency and of the further fact that there is no other quality 
which the tests from their construction could reasonably be supposed 
to be measuring, the Army Tests were to a large degree genuine 
measurements of intelligence. Now since one of the tests used in the 
above school experiment served as a principal basis for the Army 
Test,² it seems not improbable that it, too, measures intelligence with 
sufficient accuracy to be of frequent practical value, especially in 
situations where the discriminations demanded are not too fine. 

If group intelligence scores were to be used for classifying pupils 
into small groups of homogeneous ability, we could apparently expect 
a great many mistakes, but the real significance of this would depend 
upon how far a given pupil is out of place, how serious for the purpose 
in hand is such a displacement and, in a practical sense, upon how 
much better even such a classification is than the hit or miss grouping 
which usually prevails. As a matter of fact, nothing is commoner in 
educational literature at present than favorable and even enthusiastic 
reports of experiments in classification on the basis of scores in some 
group intelligence test. Disregarding the possibility that where the 


1 Yoakum and Yerkes: ‘Army Mental Tests,”’ p. 30. 
2 Yoakum and Yerkes: ‘Army Mental Tests,” p. 2. 
3 Jordan, R. H.: An Example of Classification by Group Tests, Educational 
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plan fails the experiment is not written up, this would seem to show 
that great accuracy in ranking the pupils is not essential, at any rate 
for a most noticeable improvement over present practice. 

Advising students away from such abstract studies as algebra or 
Latin and into such concrete activities as those of the commercial 
course is a use to which the intelligence tests have been successfully 
put.1 As noted above, the possibility that they test only that type of 
intelligence which works through symbols does not stand against 
them here, for that sort of ability is, of course, just what we want to 
find in this case. Deciding whether a student can carry extra courses 
without overworking2 is also a use of the tests with which their alleged 
symbolic character will not greatly interfere. 

Comparing intelligence scores with scholarship in the Normal 
School shows correlations as follows: in the college freshmen group, 
Thurstone scores with semester grades, 0.41; Brown scores with 
semester grades, 0.56. For 14 students the rank in scholarship differed 
from the rank in the Brown University Test by less than 5 places; for 
32 by less than 10 places; and the median difference in rank is 8.3. 
For 12 students the rank in the Thurstone test differed from the rank 
in scholarship by less than 5 places; for 24 by less than 10 places; and 
the median difference in rank is 11.5 places. In the junior high school 
the correlations are: Illinois Intelligence scores with school marks, 
0.25; Otis scores with marks, 0.32. But since school marks depend on 
many things besides intelligence (industry, attitude, home conditions, 
etc.), these low correlations can hardly be taken as seriously calling 
into question the validity of the test results. 
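
The rank comparisons reported in this paragraph (how many students shift fewer than 5 or 10 places between two rankings, the median shift, and the correlation) can be sketched as follows. This is only an illustration: the six pairs of scores are invented and are not the Normal School data, the helper names are hypothetical, and the product-moment formula is assumed rather than known to be the one the author used.

```python
from statistics import median


def ranks(scores):
    """Rank scores from highest (rank 1) downward; tied scores share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = mean_rank
        pos = end + 1
    return result


def rank_displacements(test_scores, grades):
    """Absolute difference, in places, between each student's two ranks."""
    return [abs(a - b) for a, b in zip(ranks(test_scores), ranks(grades))]


def pearson(x, y):
    """Product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Invented scores for six students (illustration only).
test_scores = [52, 47, 61, 38, 55, 44]
grades = [80, 85, 90, 70, 75, 78]

displacements = rank_displacements(test_scores, grades)
under_5_places = sum(1 for d in displacements if d < 5)
median_displacement = median(displacements)
correlation = pearson(test_scores, grades)
```

With real data the two lists would hold each student's test score and semester grade in the same order; the counts and the median then correspond to the figures quoted in the paragraph.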

TABLE II.—DISTRIBUTION OF PUPILS IN TWO GROUP INTELLIGENCE TESTS 

A—120 Junior High School Pupils 

Scores in      |  Scores in Intelligence Division of Illinois Examination 
Otis Test      | Below 68 | 68-81 | 82-95 | 96-109 | 110+ | Totals 
135+           |   ...    |   1   |   2   |    4   |   3  |   10 
115-134        |   ...    |   4   |   9   |    8   |   7  |   28 
95-114         |    2     |  11   |  21   |    4   |   3  |   41 
75-94          |    7     |  15   |  10   |    2   |  ... |   34 
Below 75       |    3     |   3   |   1   |   ...  |  ... |    7 
Totals         |   12     |  34   |  43   |   18   |  13  |  120 

B—54 Normal College Freshmen 

Scores in      |        Scores in Brown University Test 
Thurstone Test | Below 35 | 35-44 | 45-54 | 55-64 | 65+ | Totals 
120+           |          [cell entries illegible]       |    5 
100-119        |          [cell entries illegible]       |    6 
80-99          |          [cell entries illegible]       |   18 
60-79          |          [cell entries illegible]       |   20 
Below 60       |          [cell entries illegible]       |    5 
Totals         |    2     |  12   |  18   |  17   |  5  |   54 

C—64 Normal College Sophomores 

Scores in      |        Scores in Brown University Test 
Thurstone Test |  40-49   | 50-59 | 60-69 | 70-79 | 80-89 | Totals 
120+           |   ...    |  ...  |  ...  |   3   |   1   |    4 
100-119        |   ...    |   2   |   2   |   3   |   3   |   10 
80-99          |   ...    |   6   |  14   |  12   |  ...  |   32 
60-79          |   ...    |  10   |   4   |   1   |  ...  |   15 
Below 60       |    2     |   1   |  ...  |  ...  |  ...  |    3 
Totals         |    2     |  19   |  20   |  19   |   4   |   64 

Though failure to confirm teachers’ marks is but an indifferent 
criticism of intelligence tests, failure of one test to confirm the findings 
of another is more important. The extent to which rankings by a 
given group test varied from rankings by another, in the data cited 
above, would seem to show that, as indicated in the recent symposium 
in this journal, there is much yet to be done before group intelligence 
tests can be very fully relied upon for trustworthy placing of individual 
pupils. 


THE DEVIOUS PATH OF SLOW WORK 


GRACE E. BIRD 
R. I. College of Education and R. I. State College 


Only recently have teachers come to realize that accuracy is not 
conditioned by slow work. A class experiment in adding made by 
Thorndike1 and extending through several years indicates that a 
very close relationship exists between rapidity and accuracy. Among 
six hundred seventy-one students variations were considerable, but 
the quickest sixty-five averaged one hundred additions per one hundred 
seconds. The slowest averaged only one-fourth as many. The sixty- 
five individuals who added the most rapidly made seven errors per 
thousand additions. The twenty who were slowest made an average 
of seventeen and one-half errors. A similar relationship is shown 
throughout the intermediate speed groups, and is permanently 
characteristic there also. 

Through practice one hits upon short cuts or “kinks,” as they are 
called by the industrial worker, thereby eliminating superfluous 
motions and varying factors; hence the improvement in speed that 
comes through practice. According to Gilbreth,2 however, fast 
motions are different in character from slow motions. The learner, 
therefore, should be encouraged to attain standard speed of motions 
as early as possible. If these motions are such as cannot be made by 
the beginner at standard speed, rapidity should approach as nearly as 
possible that used by the expert. Otherwise the habit may be 
initiated incorrectly. Also, the worker in seeking speed later may 
find that the different motions may cause retroactive inhibition, as in 
other interfering habits not well automatized. Jespersen, the Danish 
philologist, found that the optimum initial rate of speed in teaching 
languages also agrees with these conclusions. In industrial practice 
the learner may be encouraged to approximate standard speed by 
giving him work in which the finest quality is not essential. Eventu- 
ally, accuracy of method and speed occur simultaneously with good 
quality. In other words, if the method and the speed are taken care 
of, the quality will take care of itself. 

By standard speed is meant not always high speed, but that rate 
of speed which will produce the best results efficiently. Undue haste 
is apt to arouse such emotions as anxiety, fear, or annoyance which 
invariably tend to interfere with rational processes. If, however, the 
child’s work in arithmetic or any other subject requiring both speed 
and accuracy be properly focalized and motivated through play 
stimuli so that it will seem worth his while to exercise optimum effort, 
he may attain speed very early in the learning process. In arithmetic, 
approximations of the answer rather than finding the exact result give 
him an opportunity to attain initial speed in the same way that the 
industrial beginner may approach standard speed if given work in 
which the finest quality of workmanship is not essential. 


1 Thorndike, E. L.: Relation Between Speed and Accuracy in Addition. 
Jour. Ed. Psych., Vol. 5. 
2 Gilbreth, F. B. & L. M.: “Applied Motion Study,” 1917, Chap. VI. 

Gilbreth, in learning to lay bricks, observed that his teacher 
employed three sets of motions to do the same thing. One was the 
demonstrating set used for teaching, the other two were employed in 
his own work, one being slow and the other fast. He used different 
motions when working slowly than when working rapidly because of 
the different muscle tension involved. In the latter instance cen- 
trifugal force, inertia, momentum, combination of motions, and play 
for position functioned favorably. When there was no emphasis on 
speed he was differently affected by these variables. 

In mental processes, also, there is a difference between rapid adjust- 
ment and slow adjustment. The distinction may be realized by the 
most casual introspection. Although adding is a familiar process it 
is very complex. In order to add eight and nine on paper, for example, 
the individual first perceives visually the number eight, at the same 
time perhaps experiencing one or more images involving associations 
depending upon his apperceptive background. This process is re- 
peated for the number nine and for the sum seventeen. Further- 
more the sum may be almost subconsciously resolved into other 
element combinations such as ten and seven, five more than a dozen, 
etc. The act of writing the number may attract the writer’s attention 
to that motor performance with its own complex elements. The longer 
one delays the completion of the act the larger the number of “irrelevant 
bonds” realized. In slow addition a person may even revert to 
wasteful habits of childhood such as counting on the fingers, lip move- 
ment, vocalization, etc. In rapid calculation learned through properly 
focalized practice, such irrelevant matters are crowded out through 
the exercise of inhibitory processes. The first perception of the 
numbers sets off the automatic response of the sum, with the 
elimination of useless and wasteful intermediate performances. 

Recently one hundred college students were tested by the writer 
with slow and rapid adding of examples taken from the Courtis 
research tests. For two minutes the students were required to work 
as quickly as possible. The median number of errors was found to be 
three, the quartile deviation 0.5. The students were then asked to 
continue adding. This time they were cautioned to work slowly and 
accurately. The median number of errors was four, the quartile 
deviation 0.8. The workers were then requested to describe every- 
thing that entered their minds during the rapid adding. Only five 
individuals recorded conscious distractions of any kind. The others 
stated as their central thought a desire to get the answer, or to add as 
rapidly as they were supposed to. When required to record their 
thoughts as experienced during slow adding all but three mentioned 
distractions. These included variety of imagery, adding by combining 
units rather than by combining groups, consciously unnecessary 
repetitions of sums obtained in the process of adding a column, emo- 
tional disturbances, physical uneasiness, observation of environmental 
stimuli, halting uncertainties regarding the sum of certain numbers, 
forgetfulness of the sum already found, losing the place, slight amuse- 
ment at the experiment, and fatigue. 
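
The two summary figures reported for each condition, the median number of errors and the quartile deviation (half the distance between the first and third quartiles), can be computed as in the sketch below. The error counts are invented for illustration and are not the writer's records; the function name is hypothetical, and the quartile interpolation used by Python's `statistics.quantiles` is an assumption that may differ slightly from the hand method of the period.

```python
from statistics import median, quantiles


def quartile_deviation(values):
    """Half the interquartile range: (Q3 - Q1) / 2."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return (q3 - q1) / 2


# Invented error counts for seven students under each condition.
rapid_errors = [1, 2, 3, 3, 3, 4, 5]
slow_errors = [2, 3, 4, 4, 4, 5, 7]

rapid_summary = (median(rapid_errors), quartile_deviation(rapid_errors))
slow_summary = (median(slow_errors), quartile_deviation(slow_errors))
```

A larger quartile deviation under one condition would indicate that the students' error counts were more widely scattered about the median in that condition.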

If it were possible to draw accurate motion paths of these distrac- 
tions the result would be a tangled skein as intricate as the motions 
of the slow industrial worker. If this vagrancy of attention occurs in 
individuals who have learned to add well enough to enjoy their skill, 
it should be even more evident in the case of the child who in the 
process of learning to add is only too ready to be diverted by outside 
stimuli from a difficult and irksome task in the stage when it is neither 
novel, nor yet pleasantly automatic. Continual shifts of attention to 
distractions might easily occasion the fatigue experienced by some of 
the individuals during the writer’s experiment in addition. Further 
investigation might show decreased efficiency even more marked than 
the reduction of accuracy from a median of three errors to a median of 
four. The larger percentage of errors during slow adding and the 
variety of irrelevant mental content indicate that in some way the 
nature of the work is different from that of rapid adding. 

In reading, also, if the by-paths of articulation, inner speech, eye 
and throat tensions, auditory, motorizing mechanisms, and imagery 
of the slow reader could be reproduced and compared with the direct 
route of the rapid reader, the relationship would no doubt parallel the 
comparison between slow and rapid adding. 

In J. A. O’Brien’s1 experiment, photographic records were made 
of the eye movements of ten pupils in grades III to VIII before and 
after training in silent reading. A study of the records showed that 
the improvement on the physiological side was effected chiefly by a 
lessening of the number of the fixation pauses rather than a decrease 
in the duration of these pauses. The development of speed was also 
accompanied by a marked decrease in the number of regressive move- 
ments and by the setting up of habits of regular rhythmical eye- 
movement. This adds evidence to the assumption that slow work is 
of a different character from quick work. 

As has already been pointed out by M. A. Burgess,1 scales for the 
comparative attainment in reading measure quality, difficulty, or 
amount, though reading is not easily measured by scales for quality 
or scales for difficulty. It is measurable by scales for amount. It 
is probable that difficulty will be indirectly measured eventually 
through a series of carefully-graded tests for amount, thereby following 
the law of the single variable as recognized in scientific measurement. 
This single variable (amount) obviously involves speed. 

In a previous experiment2 by the writer in giving standard tests 
to a whole school the highest correlations between tests occurred 
between comprehension and speed in Kansas Silent Reading and 
between speed and accuracy in Courtis arithmetic. In handwriting, 
however, a minus correlation was found between speed and legibility 
probably because the children had been trained to write slowly, and 
were therefore disturbed by the effort to inhibit superfluous motions. 
Rapid drill from the beginning focalizes and initiates habit with a 
minimum of waste. 

“Practice shortens calculation because it modifies the work, not 
only from the quantitative point of view, by increasing the speed of 
execution of the elementary operations and the speed of transition 
from one operation to another, but also, and above all, from the 
qualitative point of view, that is to say, by transforming the nature of 
the work.”3 

Pupils should think in terms of results more than in terms of the 
process. This economical method encourages speed and is more con- 
ducive to concentration because there is less danger from distracting 
elements which tend to alter the character of the work. 

Conclusion.—Fast motions are essentially different from slow 
motions not only in industrial but in intellectual work. 


1 Twentieth Yearbook, Nat. Soc. for the St. of Ed., Pt. II. 

2 Bird, Grace E.: A Test of Some Standard Tests. Jour. Ed. Psych., Vol. I, 
No. 5. 

3 Foucault, M.: L’Etude Scientifique du Travail Mental Specialement Dans 
le Travail d’Addition. L’Année Psychologique, Tome XX, p. 125. 


CONSTANCY OF THE STANFORD-BINET IQ AS 
SHOWN BY RETESTS 


JOHN L. STENQUIST 
Bureau of Reference, Research and Statistics Board of Education, New York City 


In the September 1921 issue of this Journal appeared a summary of 
six reports1 on the above topic including a study reported to have been 
made by the writer, and another by Miss Fermon. While mention 
was made of the fact in a footnote, it should be made clearer that the 
same cases are involved in both reports but the data were treated in a 
somewhat different way in each case. In that article the “conclusions 
ignore the data of Stenquist and Fermon which—must be unsound,” 
in view of the contradictory results reported by the other workers. 
We are anxious to be the first to express our gratification at the higher 
constancy found by other investigators. In fact, it was precisely 
because of the disappointingly low constancy found by us that the 
complete report has been withheld from publication in the hope that 
other and more encouraging ones would appear. We yield to none 
in our insistence upon the importance of proper standards of qualifi- 
cation for mental testers, but we do not feel there is necessarily final 
ground for admitting the unsoundness of our data. Frankly, however, 
we hope they are unsound. We fully agree that the other reports do 
strongly tend to cast doubt upon the validity of our results, and naturally 
the five reports summarized in the article referred to are therefore of 
particular interest to us. Our tests were given by four persons, and 
errors made by any of these may of course be responsible. Their 
training and experience were as follows: 

One, a Smith College graduate and graduate student at New York 
University, has acted as examiner for the Public Education Association 
for several years, giving hundreds of Binet tests, and hence her pro- 
ficiency was unquestioned. 

The second examiner is a graduate of Vassar, where she had substan- 
tial psychological training. At least 20 Binet tests were given there 
by her under close supervision. Following this she had the experience 
of testing between 50 and 60 cases in a psychological clinic in New York 
City. After this she gave approximately 40 Binet tests in a survey 
by the Department of Ungraded Classes in New York City. All this 
experience plus a thorough psychological training should make her more 
proficient than many examiners. 


1 Rugg, Harold and Colloton, Cecile: Constancy of the Stanford-Binet as 
Shown by Retests. Journal of Educational Psychology, September, 1921, pp. 315- 
322. 

The third examiner is also a graduate of Vassar, where she had had 3 
years of work in various branches of psychology, and at the time the 
present study was conducted she was taking a graduate course in 
psychology at Columbia University. In connection with the course 
in applied psychology at Vassar College she had given about 25 Binet 
tests, during a period of 9 months. The first of these tests were given 
in the presence of the instructor and the results in the remainder were 
checked by the instructor, in so far as that is possible. 

The fourth examiner, also a graduate student, had given at least 
200 tests prior to this experiment and had had thorough college and 
clinical training. 

Whether or not our examiners were competent can only be inferred. 
That our larger differences may be due to the foreign character of the 
population tested seems most likely, however, as in our group the lan- 
guage factor was a serious one. If a pupil who lives in a home where 
English is not spoken is tested at the beginning of school, say at age 6 
to 7—and then retested after a period of 6 months to 18 months in 
school where the English language is acquired, it is reasonable to 
suppose that this knowledge of English will improve his score appreci- 
ably—as much as the improvement shown in our retests. 

Thus while on the whole we too would prefer to assume that in some 
way the technique of our examiners differed sufficiently to explain the 
differences, rather than to destroy our confidence in the fairly high 
average constancy of the Stanford-Binet test, the language-difficulty 
factor alone seems adequate to explain our higher retest scores. Even 
with the assumption that the Stenquist-Fermon data are unsound, 
however, there still remain some troublesome points in the matter of 
the constancy of an IQ. Leaving our data entirely out of consideration 
for the moment we may still note the wide range—from —20 IQ to 
over 20 IQ in the Terman data, from —15 IQ to 17 IQ in the Rugg- 
Colloton data, and from —14 IQ to 15 IQ in the cases of Garrison. 
Does this not mean that when we cite the case of a pupil tested within, 
say, 6 months to 18 months, the IQ assigned to him may be wrong by as 
much as 20 or more points? To be sure it is chiefly a question of how 
often this will occur, but the disturbing fact is that this can and does 
occur at all. Even if we limit it to the large error of, say, ‘not more 
than 15 points wrong,’ it still occurs too frequently for comfort. The 
percentages of cases which differed 15 points or over as shown in the 
article referred to are: 

For Terman’s data: in 29 out of 435, or in about 7 per cent of the 
cases. 

For the Rugg-Colloton data: in 6 out of 137, or in about 4 per cent 
of the cases. 

For Garrison’s data: in 1 out of 62, or in about 2 per cent of the 
cases. 

In our data this percentage rises to 11 per cent, which in the light of 
the other data seems too high. But whether it is 2 or 7 or 11 children 
in a hundred, in whose cases we make this huge blunder, it is serious. 
Assuming adequate proficiency of all testers, the imperfect reliability 
of our scales of course also contributes to the unreliability of our con- 
stancy figures. That the Intelligence Quotient is very closely constant 
for each child seems doubtful in view of these wide ranges, and the 
relatively high reliability of the Binet test, no matter what may be the 
case “on the average.” In the Stenquist-Fermon data if we eliminate 
the 26 children who differed by 20 or more, the distribution is not 
markedly different from that of Terman’s data. It is these 26 cases1 
that look the most questionable. We shall await with much interest 
the findings of other workers. 
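
The rounded percentages quoted above for the three outside studies follow directly from the counts given in the text, as the short check below shows. The counts are those stated in the article; the pooled figure at the end is an added illustration that does not appear in the source, and the variable and function names are hypothetical.

```python
# (cases differing by 15 or more IQ points, total cases), as quoted in the text.
retest_data = {
    "Terman": (29, 435),
    "Rugg-Colloton": (6, 137),
    "Garrison": (1, 62),
}


def percent(part, whole):
    """Share of `part` in `whole`, expressed in per cent."""
    return 100 * part / whole


shares = {name: percent(k, n) for name, (k, n) in retest_data.items()}
rounded = {name: round(p) for name, p in shares.items()}

# Illustration only (not in the article): the three studies pooled together.
pooled_cases = sum(k for k, _ in retest_data.values())
pooled_total = sum(n for _, n in retest_data.values())
pooled_percent = percent(pooled_cases, pooled_total)
```

Rounded to whole numbers, the three shares come to 7, 4 and 2 per cent, in agreement with the figures cited.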





1 Are these 26 cases those having language difficulties? This should be ascer- 
tained. H.O.R. 


INTELLIGENCE TESTS 


The Results of Repeated Mental Re-examinations of 639 Feeble-minded Over a 
Period of Ten Years. F. Kuhlmann. Journal of Applied Psychology, 1921, 
September, 221-224. A study of mental age growth curves and the constancy of 
the IQ for the four groups of the feeble-minded, each group being studied separately. 
Complete data given. 

The Intelligence of Chinese Children in San Francisco and Vicinity. Kwok 
Tsuen Yeung. The Journal of Applied Psychology, 1921, September, 267-274. 
Results of testing 109 Chinese children with the Stanford-Binet Test. Details are 
given in eight tables. Comparison made with Terman’s data on American 
children. 

A Comparison of Brahman and Panchama Children in South India with Each 
Other and with American Children by Means of the Goddard Form Board. D. S. 
Herrick. Journal of Applied Psychology, 1921, September, 253-260. Racial 
differences in general intelligence. Comparison of the results of tests given to 355 
high caste Indian Children, 355 low caste, and 1572 American children. 

Pictorial Completion Test II. Wm. Healy. Journal of Applied Psychology, 
1921, September, 225-239. The picture completion test as the fairest test of 
apperceptive abilities. Description of test; directions for giving and scoring test; 
and norms of performance. 

A Cycle Omnibus Intelligence Test for College Students. L.L. Thurstone. Jour- 
nal of Educational Research, 1921, November, 265-278. Description of the selec- 
tion and cycle arrangement of six tests to be given to college freshmen. Norms 
of performance for the freshmen of a number of engineering and liberal arts 
colleges and normal schools. 

The Case for the Low IQ. J. L. Stenquist. Journal of Educational Research, 
1921, November, 241-254. Criticism of the narrow, academic nature of present- 
day intelligence tests. Discussion of other kinds of “general” intelligence illus- 
trated by tests of mechanical ability. 

Where Test Scores and Teachers’ Marks Disagree. Mary B. Lindsay and Ruth 
S. Gamsby. The School Review, 1921, November, 678-687. Special studies of 
46 cases showing a wide difference between the score on Terman group test and the 
average of teachers’ estimates of work of each student in each subject. Binet 
test used to confirm group test score. Explanation for divergence given in each 
case. 

The Grading and Promotion of Pupils. Chas. B. Willis. Journal of Educational 
Method, 1921, November, 90-95. The value of mental measurement; follow-up 
work; what has been accomplished in the Alexander Taylor School of Edmonton, 
Alberta, Canada. 


EDUCATIONAL TESTS 


A First Report on Two Diagnostic Tests in Silent Reading for Grades II to IV. 
Luella C. Pressey. Elementary School Journal, 1921, November, 204-211. An 
analysis of the silent reading problem in the lower grades followed by a description 
of two tests, one for speed and the other for vocabulary, to be used as diagnostic 
tests. Information also given to show how the tests were validated and to illus- 
trate the practical use of the tests and the interpretation of results. 

Comparative Scoring and Recording of Educational Tests. E. E. Lindsay. Edu- 
cational Administration and Supervision, 1921, November, 427-432. Description 
of a percentage system of translating scores of standardized educational tests. 
Diagnostic possibilities of the system illustrated by actual cases. 

The Measurement of High School English. Edward Wm. Dolch, Jr. Journal 
of Educational Research, 1921, November, 279-286. A defense for the amount of 
time given to high school English. Why the results of English teaching cannot be 
adequately measured. 

Measuring the Efficiency of Teachers by Standardized Tests. Samuel Brooks. 
Journal of Educational Research, 1921, November, 255-264. Rating the teacher 
according to progress made by pupils as measured by standardized tests. Illus- 
trations of the practical working of the plan. 


TESTS FOR SPECIAL ABILITIES 


The Construction of Tests for Discovery of Vocational Fitness. Frank Watts. 
Journal of Applied Psychology, 1921, September, 240-252. A classification and 
discussion of the tests already in use. Guiding principles in the construction of 
such tests. 

Methods for the Selection of Comptometer Operators and Stenographers. M. A. 
Bills. Journal of Applied Psychology, 1921, September, 275-283. Report of a 
study made with certain tests of the Bureau of Personnel Research of Carnegie 
Institute of Technology, to determine whether the tests would (1) eliminate 
failures, and (2) select sure successes. Satisfactory results given in detail. 


MISCELLANEOUS 


Three Refinements of Method in School Surveys. Florentino Cayco and Sidney L. 
Pressey. Educational Administration and Supervision, 1921, November, 433-438. 
Report of a survey of Grades 1, 2, and 3 in three ward schools. Educational 
efficiency shown best by “ability grade table,” evenness of development in all 
subjects, and correlation between ability and achievement in individual cases. 

The Relative Standing of Mathematical and Non-mathematical Pupils. John A. 
Marsh. Educational Administration and Supervision, 1921, November, 458-466. 
Results of a study of 115 pupils in the Boy’s English High School, Boston. Two 
groups—one studying no mathematics, the other studying mathematics in the first 
year. Groups almost exactly the same in first year work. Mathematical group 
decidedly superior in work of second and third years. 

Mind-set and Learning. William H. Kilpatrick. Journal of Educational 
Method, 1921, November, 95-102. Part I of a popular presentation of the laws of 
learning. 

Filmed Geometry. Charles H. Sampson. Journal of Educational Method, 
November, 1921, 116-117. The place of the educational film in the class-room and 
especially in evening schools. 

Mental Types, Truancy, and Delinquency. Edgar A. Doll. School and Society, 
1921, November 26, 482-485. Truancy and consequent delinquency in large part 
the fault of the public school system. Need for a scientific classification of children 
according to individual differences in mental type, and differentiated courses of 
study. 

Investigations Undertaken by the Society for Experimental Pedagogy in Denmark. 
Christian Hansen Tybjerg. Journal of Educational Research, 1921, November, 
301-307. Brief mention of a number of investigations, physical and psychological, 
conducted by the Society for Experimental Pedagogy, with the results of each. 


Minor Studies in Educational Psychology. Francis Gaw. Journal of Applied 
Psychology, 1921, September, 284-286. 
1. School Ratings and Moving Pictures. Influence of movies on the conduct 
and school ratings of 337 children in a suburb of Boston. Practically no relation. 
2. Relation of Stanford Tests and Dearborn Maze Tests. Correlations between 
Stanford-Binet Scores and Dearborn scores of 77 patients at the Boston Psycho- 
pathic Hospital grouped according to diagnosis. Comparison of Dearborn scores 
of 77 patients and 36 normal adults connected with the hospital. 











NEW PUBLICATIONS IN EDUCATIONAL 
PSYCHOLOGY AND RELATED FIELDS OF 
EDUCATION 











1. A new and important text in General Psychology of particular 
interest to educational psychologists is the new text by Woodworth.1 

The usual text in psychology has seemed to many psychologists 
working in the field of education to offer comparatively little which 
could be applied to the solution of educational problems. There 
were, of course, two interpretations of this fact; one, that general 
psychology was by nature not susceptible of direct application and the 
other, that the type of psychology ordinarily represented in general 
texts and courses was not of the character which could readily be 
applied. Woodworth’s text demonstrates that the second explanation 
is more nearly the correct one. It represents distinctly a type of treat- 
ment which, without much direct discussion of educational problems, 
illumines the processes which are involved in learning and in teaching. 
This will be clear from a description of the book. 

The general plan of the book is as follows: After defining and 
delimiting the subject in a simple and clear fashion, the author opens 
with a discussion of reactions. He begins with the simplest reactions, 
the reflexes, and proceeds to a discussion of the more com- 
plex ones, endeavoring to avoid a break in the continuity of the discus- 
sion. The different levels of reactions are discussed with particular 
reference to the organization of the nervous system. The nervous 
system, however, is treated not as a separate topic but simply as a link 
in the chain of the explanation of reactions. Neurological explana- 
tions, moreover, are included at any place in the book where they are 
called for. In this way the whole treatment is permeated by a 
reference to the nervous basis of mental life. 

Transition from the simpler reactions to the higher and more com- 
plex ones is made through the development of the concept of tendencies. 
These are the relatively permanent dispositions of the organism which 
bring about what are sometimes called indirect responses. They 
account for such features of mental life as motives, without abandoning 
the fundamental notions of reaction. 


1 Woodworth, Robert S.: “Psychology, A Study of Mental Life,” New York, 
Henry Holt & Company, 1921, p. 580. 

The transition to a descriptive account of the various types of 
native reactions is furnished by a discussion of the relation between 
native and acquired responses. The native reactions themselves are 
classified under the heads of instinct, emotion, feelings, sensations, 
attention and intelligence. It will be seen that the order is, in a number 
of instances, the reverse of the usual one; thus emotion comes before 
feeling and sensation and attention after a prolonged discussion of 
activities in which they are involved. This accords with the general 
mode of treatment by which the total reaction is first described as a 
whole and then analysed into its elementary processes. Other illustra- 
tions of a similar reversal of the usual order are found in the placing of 
perception after learning, association after memory, and imagination 
after reasoning. The same reason holds here as in the previous case. 

The chapter on Intelligence closes the treatment of native responses 
and forms the transition to the description of acquired responses. 
As the first phase of the discussion opens with native reactions in the 
form of reflexes, so the second phase opens with acquired reactions in 
the form of learning. This is followed by memory, including a fairly 
detailed account of the process of memorizing, and then by association, 
perception, reasoning, imagination, will and personality. 

This plan reveals the general character of the book. It is 
behaviouristic, with a small “b.” Mental life is conceived as a form of 
activity organically related to bodily activity, and not as a passive 
spectator on the scene of life. The author refuses to follow the extrem- 
ists of the behaviouristic school, however, but makes reasonable use 
of the method of introspection and ascribes due importance to the 
sensory, perceptual and imaginational processes. These processes, 
however, are not independent elements but are functional parts of 
reactions. The inclusion of tendencies saves the discussion from an 
undue emphasis upon the simple animal-like type of reactions. 

The content of the book is comprehensive. It includes the some- 
what novel topics of learning, memorizing and intelligence, besides the 
usual ones. These, of course, fit very naturally into the general plan 
of the book and constitute part of the reason why it will prove particu- 
larly useful to educational psychology. 

The whole discussion as well as the general plan is thoroughly 
matured, well-organized, systematic and consistent. There is nothing 
improvised about the book. The author uses the results of scientific 
studies ranging over the whole field but presents them in a thoroughly 
assimilated form. The student is given the conclusions from these 
scientific studies without being confused with detailed debates on 
matters of theory. The chief issues, however, are presented in clear 
and simple fashion. 

The style of the book is simple, direct and, in places, colloquial. 
This will perhaps be an attractive feature to the undergraduate stu- 
dent. The concessions are sometimes considerable, as in the sentences 
“There are lots of nerve cells,” and “Not that Freud would OK our account 
of dreams up to this point.” In places the style becomes picturesque, 
reminding one of James: “Man is by all odds the most pottering, 
hem-and-hawing of animals.” It seems likely that the book will in 
some measure, at least, counteract the tradition that the study of 
psychology is a very formidable and abstract affair. 

It will be gratifying to many psychologists to find the author, 
while giving due credit to the contributions made by Freud’s research, 
refusing to accept the extravagances of his theory. The author’s sane and 
comprehensive statement of the limits of that theory should have large 
influence. This is but an instance of the balance and sanity of the 
entire book. 

FRANK N. FREEMAN. 


2. The Second Volume on the Virginia Survey.—Part II of this 
survey report¹ deals with educational tests. The purpose and scope 
of the measurement program are outlined in the opening pages. Local 
conditions necessitated a careful adaptation of test materials and 
standards if the survey was to accomplish its twofold purpose: (1) 
To present such evidence of the status of the schools as might lead to 
necessary action for improvement by constituted authorities. (2) 
To disseminate information, stimulate interest and develop under- 
standing of the best educational methods to make for a permanent 
local force for the improvement of education. 

The difficulties of administering the state-wide testing movement 
were surmounted by effective organization under the leadership of Dr. 
Haggerty and by the assistance of the General Education Board. 





1 Hart, Harris, President of the Virginia Education Commission, and Inglis, 
Alexander J., Director of The Virginia Survey Staff: “Virginia Public Schools: A 
Survey of a Southern State Public School System.” “Part II—Educational 
Tests.” “Educational Survey Series.” Yonkers: World Book Company, 1921, 
pp. 235. 
About 16,000 different children were examined with from six to 
forty tests each. About 5,000 were in grades III to VII of rural white 
schools. Another thousand were in grades I and II of the same 
schools. About 6,000 white children were in the seven grades of 
city schools, and in the first year of high school. About 3,000 colored 
children were examined. Great care was exercised in the selection of 
schools to be tested, and in the selection and training of prospective 
examiners. The scoring was done by specially trained and supervised 
advanced and graduate students and carefully checked by the survey 
staff. 

While Dr. Haggerty is responsible for the general plan of the 
reports, other members of the survey staff contributed chapters. 
Chapter II contains a concise preliminary statement of conclusions 
and recommendations which grow out of the statistical evidence sub- 
mitted in the following chapters. In addition to the tabulations, 
graphical representations, and the other matter usually found in such 
reports, there is a long chapter on the criteria for evaluating tests as a 
basis for grouping elementary school pupils. This chapter is designed 
for the critical reader and gives the statistical basis of statements 
made in the following chapter. While most of the conclusions are of 
local interest, the data assembled in the volume are worth careful study 
and form a valuable addition to survey literature. Students of Education 
in other southern states will find in this volume suggestions for the 
solution of their problems. 


L. Z. 





3. Light on Some Aspects of Education in England.—Students of 
comparative education will find in this addition¹ to the “Modern 
Educator’s Library” a brief exposition of the chief features, principles 
and ideals in English education as exemplified in the organization and 
curricula of schools. The material will be much more readable to 
those who have acquired, in some previous experience, the English 
connotation of such terms as “Public School,” “Elementary 
Education,” not to mention “vulgar fractions,” and such grouped modifiers 
as “Ordinary Public Elementary School.” A glossary of English 
educational terms with American equivalents would save much 
descriptive matter and help the American reader to sense the situation 
described. 

1 Sleight, W. G.: “The Organization and Curricula of Schools.” “The Modern 
Educators’ Library.” New York: Longmans, Green and Co.; London: Edward 
Arnold, 1920, p. 264. 

A brief historical introduction is followed by two chapters on 
organization of schools and one on buildings and equipment. Chapter 
VI deals with principles underlying the curriculum and is followed 
by four chapters dealing with particular aspects of curricula. One 
chapter is given to the discussion of a flexible curriculum, the 
feasibility of which would be enhanced by the general adoption of a 
“minimum curriculum of fundamentals.” A chapter is given over to the 
presentation, analysis and criticism of “time tables” or class programs. 
Some of the evaluations show that differences between English and 
American standards lead to widely divergent conclusions with reference 
to the same data. 

Some of the tabulations are not headed, labelled or interpreted, 
and the only indication of what they represent must be sought in the 
adjoining pages. There is a chapter on teacher training, classification 
and other administrative problems, only part of which is factual. 
Chapter XI discusses the psychological foundations of school govern- 
ment at some length. The next chapter is given over to brief de- 
scriptions of the status of education in other lands. This is followed 
by a discussion of the implications of the “Education Act of 1918.” 
The book contains a classified bibliography of pertinent educational 
literature. 

L. Z. 